DevOps Zone is brought to you in partnership with:

Mark is a graph advocate and field engineer for Neo Technology, the company behind the Neo4j graph database. As a field engineer, Mark helps customers embrace graph data and Neo4j building sophisticated solutions to challenging data problems. When he's not with customers Mark is a developer on Neo4j and writes his experiences of being a graphista on a popular blog at http://markhneedham.com/blog. He tweets at @markhneedham. Mark is a DZone MVB and is not an employee of DZone and has posted 544 posts at DZone. You can read more from them at their website. View Full User Profile

Sed: Replacing Characters with a New Line

01.03.2013
| 5399 views |
  • submit to reddit

I’ve been playing around with writing some algorithms in both Ruby and Haskell and the latter wasn’t giving the correct result so I wanted to output an intermediate state of the two programs and compare them.

I didn’t do any fancy formatting of the output from either program so I had the raw data structures in text files which I needed to transform so that they were comparable.

The main thing I wanted to do was get each of the elements of the collection onto their own line. The output of one of the programs looked like this:

[(1,2), (3,4)…]

To get each of the elements onto a new line my first step was to replace every occurrence of ‘, (‘ with ‘\n(‘. I initially tried using sed to do that:

sed -E -e 's/, \(/\\n(/g' ruby_union.txt

All that did was insert the string value ‘\n’ rather than the new line character.

I’ve come across similar problems before and I usually just use tr but in this case it doesn’t work very well because we’re replacing more than just a single character.

I came across this thread on Linux Questions which gives a couple of ways that we can get see to do what we want.

The first suggestion is that we should use a back slash followed by the enter key while writing our sed expression where we want the new line to be and then continue writing the rest of the expression.

We therefore end up with the following:

sed -E -e "s/,\(/\
/g" ruby_union.txt

This approach works but it’s a bit annoying as you need to delete the rest of the expression so that the enter works correctly.

An alternative is to make use of echo with the ‘-e’ flag which allows us to output a new line. Usuallybackslashed characters aren’t interpreted and so you end up with a literal representation. e.g.

$ echo "mark\r\nneedham"
mark\r\nneedham
 
$ echo -e "mark\r\nneedham"
mark
needham

We therefore end up with this:

sed -E -e "s/, \(/\\`echo -e '\n\r'`/g" ruby_union.txt

** Update **

It was pointed out in the comments that this final version of the sed statement doesn't actually lead to a very nice output which is because I left out the other commands I passed to it which get rid of extra brackets.

The following gives a cleaner output:

$ echo "[(1,2), (3,4), (5,6)]" | sed -E -e "s/, \(/\\`echo -e '\n\r'`/g" -e 's/\[|]|\)|\(//g'
1,2
3,4
5,6

Published at DZone with permission of Mark Needham, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)