About sed
What is sed?
— WikiWikiWeb: Sed Language
- sed - a Stream EDitor
- sed is a UNIX utility which processes a text file one line at a time. It has RegularExpression based string manipulation, a hold buffer, and some basic flow control. Amazing things can be done with these BearSkinsAndStoneKnives.
Yes, I guess sed is a stone knife or bear-skin, in that it's one of those ancient[1] Unix utilities with a reputation for being powerful in the right context, but a bit difficult to wield. Whether or not the reputation is justified, it's good to know a little about sed because it's everywhere in Unix/Linux, and it proves useful in many situations.
My main use case for sed is when I find myself thinking, "Gee, I just need to run a regular expression over this file or bunch of text."
In addition to replacing, it can delete certain text or lines, or insert blank lines. The way sed changes files is called non-interactive editing: all change instructions are defined up front, and then sed applies them to the input, line by line.
This makes it suited to be part of shell scripts or other automated workflows, and handy for one-time changes such as data cleanup.
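As a rough sketch of the deleting and blank-line features mentioned above, the snippet below runs two one-liners against a small throwaway file. The file name notes.txt and its contents are made up for illustration; the d and G commands themselves are standard sed.

```shell
# Hypothetical sample input, created only for this sketch.
workdir=$(mktemp -d)
printf 'keep me\n# a comment\nkeep me too\n' > "$workdir/notes.txt"

# d deletes every line matching the address /^#/ (lines starting with "#").
deleted=$(sed '/^#/d' "$workdir/notes.txt")

# G appends the (empty) hold buffer to each line, which double-spaces the text.
spaced=$(sed G "$workdir/notes.txt")

printf '%s\n' "$deleted"
printf '%s\n' "$spaced"
```

Both commands leave notes.txt untouched and write their result to standard output, which is why the sketch captures it in variables.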
How to learn sed
Check out Sed Examples by Sasikala for a nice overview of sed features and uses. Each of the posts contains examples organized by feature and command, which makes it easy to find something specific to the task at hand.
There's a DigitalOcean tutorial: The Basics of Using the Sed Stream Editor to Manipulate Text in Linux. However, it's my opinion that some of the best sed info is found on websites of a certain vintage:
- Sed - An Introduction and Tutorial by Bruce Barnett for plenty more instructive examples.
- the sed $HOME is a cornucopia of examples and one-liners, but also has lots of awesome sed scripts, including tic-tac-toe and other games!
- man sed as well -- it's a good man page, sed.
Example usage
Here's a simple example for replacing a certain string value in some data. Given a text file:
> cat trends.txt
old and busted fashions
old and busted hats
music that is old and busted
all old and busted everything
Perform the substitution on all lines of the original file with command s, and redirect the output to a new file:
> sed 's/old and busted/new hotness/' trends.txt > new-trends.txt
The new file looks exactly like the original file, but with our phrase replaced:
> cat new-trends.txt
new hotness fashions
new hotness hats
music that is new hotness
all new hotness everything
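One detail worth knowing: without a flag, s replaces only the first match on each line. The sketch below compares the plain command with the g (global) flag on a made-up line that contains the phrase twice; the sample text is an assumption for illustration and not part of trends.txt.

```shell
# Hypothetical input line with two occurrences of the phrase.
line='old and busted hats for old and busted heads'

# Without a flag, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/old and busted/new hotness/')

# With the g flag, every match on the line is replaced.
every_match=$(printf '%s\n' "$line" | sed 's/old and busted/new hotness/g')

printf '%s\n' "$first_only"
printf '%s\n' "$every_match"
```

For trends.txt the two forms give the same result, because each line holds the phrase only once.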
Case study
I encountered some wild Pokemon data that needed a bit of cleanup:
{
  "id": "040",
  "name": "Wigglytuff",
  "img": "http://img.pokemondb.net/artwork/wigglytuff.jpg",
  "type": ["Normal"],
  "stats": {
    "hp": "140",
    "attack": "70",
    "defense": 45,
    "spattack": "75",
    "spdefense": "50",
    "speed": 45
  },
  "moves": {
    // ...
  }
  // ...
}
As the sample shows, some of the property values in stats are formatted as strings, and some as numbers. It's not only defense and speed: all of the stats are formatted inconsistently throughout the file, which makes the data harder to use. It's possible the database or import script could coerce the values for us, but let's make changes to the source file itself so that the data will be consistent no matter how we use it.
Here is the incantation to sed:
> sed -E '/[[:space:]]*("id"|"height")/!s/"([[:digit:]]+\.*[[:digit:]]*)"/\1/'
- The RegEx part before ! tells sed to ignore a line if it has either "id" or "height" [the [[:space:]]* prefix allows for any whitespace before the attribute]
  - This is because ids and heights have numbers in their values, but the nature of the data indicates that they should remain formatted as strings.
- The substitution command s/"( ...symbols... )"/\1/ does this:
  - Match on patterns that look like numbers inside quotation marks
  - Group the part inside the quotation marks with parentheses
  - Replace with the matched group
  - Examples: "7.25" becomes 7.25, "10" becomes 10
- The flag -E is for modern/extended RegEx format
In practice, the full command would also indicate the original and output filenames, as seen in the simple replace example earlier in this post. The end result is that all number values are represented as Numbers, not Strings, except for those properties where string representation is appropriate for number-like values.
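To see the command end to end, here is a minimal sketch that runs it against a cut-down record. The file names pokemon.json and pokemon-clean.json and the "height" value are assumptions for illustration; the sample data above does not show a height field.

```shell
# Hypothetical cut-down record; "height" is assumed, not taken from the sample.
workdir=$(mktemp -d)
cat > "$workdir/pokemon.json" <<'EOF'
{
  "id": "040",
  "height": "1.0",
  "stats": {
    "hp": "140",
    "defense": 45,
    "speed": 45
  }
}
EOF

# Same expression as above, now with input and output files.
# Lines containing "id" or "height" are skipped; elsewhere a quoted number
# loses its quotation marks.
sed -E '/[[:space:]]*("id"|"height")/!s/"([[:digit:]]+\.*[[:digit:]]*)"/\1/' \
  "$workdir/pokemon.json" > "$workdir/pokemon-clean.json"

cat "$workdir/pokemon-clean.json"
```

In the output, "hp" becomes a bare 140, while the "id" and "height" lines keep their quoted values and the already-numeric stats are left alone.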
Summary
Bottom line: it's good to know about sed, what it does, and how it can be applied to everyday problems. It's a widely available utility, worth keeping in your Unix toolbox, even if it appears at first to have all the user-friendliness of an old flint blade.
Links
- WikiWikiWeb: Sed Language
- Sed Examples by Sasikala
- The Basics of Using the Sed Stream Editor to Manipulate Text in Linux
- Sed - An Introduction and Tutorial by Bruce Barnett
- the sed $HOME
[1] sed is older than I am, so that's prehistoric from my perspective. If you originated in the 80s, then it predates you, too: "sed" stands for Stream EDitor. Sed is a non-interactive editor, written by the late Lee E. McMahon in 1973 or 1974. A brief history of sed's origins may be found in an early history of the Unix tools, at http://www.columbia.edu/~rh120/ch106.x09.
Kentucky seal generator
I made this app that renders a Seal of the Commonwealth of Kentucky in SVG. The motto can be updated, and the generated Seal can be saved in PNG format. These notes cover a few implementation details for how to edit and save an SVG image with JavaScript.
View the Kentucky seal generator app.
ThankSVGng
This project really got off the ground when I found this SVG Seal of Kentucky. The image is in the public domain, so I downloaded it and removed the original motto text shapes. Thanks, Wikimedia Commons!
Next, the SVG gets some new nodes to contain the custom motto. The attributes text-anchor="middle" and startOffset="50%" together keep the motto centered on the paths:
<svg>
  <defs>
    <path id="p1" d="M 175 331 A 145 145 0 1 1 488 331" />
    <path id="p2" d="M 168 331 A 132 132 0 1 0 495 331" />
  </defs>
  <text text-anchor="middle">
    <textPath id="motto_top" xlink:href="#p1" startOffset="50%">
      UNITED WE STAND
    </textPath>
  </text>
  <text text-anchor="middle">
    <textPath id="motto_btm" xlink:href="#p2" startOffset="50%">
      DIVIDED WE FALL
    </textPath>
  </text>
</svg>
The other important textPath attributes are its id, used by Snap to update the motto contents, and xlink:href, which links each textPath to its path definition.
The two path elements define arc-shaped paths for the motto to follow, basically two semicircles, as shown here:
Changing the motto
Snap.svg will help display the seal and edit its motto. The first task for Snap is to load the blank Seal SVG from an external file. Contents will be added to the empty SVG element on the HTML page with id svg_home:
<svg
  id="svg_home"
  width="100%"
  height="100%"
  viewBox="0 0 662 662"
  preserveAspectRatio="xMinYMin meet"
>
  <!-- contents of the blank seal svg loaded here -->
</svg>
Snap.load("seal.svg", function(data) {
  // call Snap.add(data) on the outer SVG element
})
After the image is loaded, Snap wraps the SVG element and allows the textPath nodes to be identified:
let s = Snap("#svg_home")
let topText = s.select("#motto_top")
let btmText = s.select("#motto_btm")
Finally, the motto can be updated by setting #text attributes with Snap, suitable for wiring to an input event listener function:
function updateOnInput(topValue, btmValue) {
  topText.attr({ "#text": topValue })
  btmText.attr({ "#text": btmValue })
}
Save as...
Setting up the motto live-editing was easy with Snap, but saving the result isn't as straightforward. Even if you get the source of the SVG, it's still in SVG format, which isn't the nicest for viewing and sharing.
Fortunately, we can rely on browser APIs and the SVG.toDataURL and canvg libraries for SVG-to-PNG conversion without external tools or server requests.
When the 'Get image' button is clicked, this code inside the click event callback gets the SVG data, following from the example documentation for SVG.toDataURL:
sealEl.node.toDataURL("image/png", {
  callback: function(data) {
    // set the image source to data and
    // do some other stuff to handle display
    // of the image at this time.
  },
})
Also note that Snap's Element.node is used to identify the seal SVG (in this case, a nested SVG element) and get a reference to the DOM object to convert with toDataURL.
Summary
Looking back, the uws-dwf app evolved in two main phases:
- Prototype of SVG-changing with Snap to check how easy it would be.
  - result: easy indeed, once you know how and where to draw the paths.
- Integrating other libraries to handle PNG conversion.
Not much else to it. #simplegifts