XML Output (-oX)
XML, the extensible markup language, has its share of critics as well as plenty of zealous proponents. I was long in the former group, and only grudgingly incorporated XML into Nmap after volunteers performed most of the work. Since then, I have learned to appreciate the power and flexibility that XML offers, and even wrote this book in the DocBook XML format. I strongly recommend that programmers interact with Nmap through the XML interface rather than trying to parse the normal, interactive, or grepable output. The XML format includes more information than the others and is extensible enough that new features can be added without breaking existing programs that use it. It can be parsed by standard XML parsers, which are available for all popular programming languages, usually for free. Editors, validators, transformation systems, and many other applications already know how to handle the format. Normal and interactive output, on the other hand, are custom to Nmap and subject to regular changes as I strive for a clearer presentation to end users. Grepable output is also Nmap-specific and tougher to extend than XML. It is considered deprecated, and many Nmap features such as MAC address detection are not presented in this output format.
An example of Nmap XML output is shown in Example 13.9. Whitespace has been adjusted for readability. In this case, XML was sent to stdout thanks to the -oX - construct. Some programs executing Nmap opt to read the output that way, while others specify that output be sent to a filename and then read that file after Nmap completes.
# nmap -T4 -A -p 1-1000 -oX - scanme.nmap.org
<?xml version="1.0"?>
<?xml-stylesheet href="file:///usr/local/bin/../share/nmap/nmap.xsl" type="text/xsl"?>
<!-- Nmap 5.59BETA3 scan initiated Fri Sep 9 18:33:41 2011 as:
nmap -T4 -A -p 1-1000 -oX - scanme.nmap.org -->
<nmaprun scanner="nmap" args="nmap -T4 -A -p 1-1000 -oX - scanme.nmap.org" start="1315618421"
startstr="Fri Sep 9 18:33:41 2011" version="5.59BETA3" xmloutputversion="1.03">
<scaninfo type="syn" protocol="tcp" numservices="1000" services="1-1000"/>
<verbose level="0"/>
<debugging level="0"/>
<host starttime="1315618421" endtime="1315618434">
<status state="up" reason="echo-reply"/>
<address addr="74.207.244.221" addrtype="ipv4"/>
<hostnames>
<hostname name="scanme.nmap.org" type="user"/>
<hostname name="li86-221.members.linode.com" type="PTR"/>
</hostnames>
<ports>
<extraports state="closed" count="997">
<extrareasons reason="resets" count="997"/>
</extraports>
<port protocol="tcp" portid="22">
<state state="open" reason="syn-ack" reason_ttl="53"/>
<service name="ssh" product="OpenSSH" version="5.3p1 Debian 3ubuntu7"
extrainfo="protocol 2.0" ostype="Linux" method="probed" conf="10">
<cpe>cpe:/a:openbsd:openssh:5.3p1</cpe>
<cpe>cpe:/o:linux:kernel</cpe>
</service>
<script id="ssh-hostkey"
output="1024 8d:60:f1:7c:ca:b7:3d:0a:d6:67:54:9d:69:d9:b9:dd (DSA)

2048 79:f8:09:ac:d4:e2:32:42:10:49:d3:bd:20:82:85:ec (RSA)"/>
</port>
<port protocol="tcp" portid="80">
<state state="open" reason="syn-ack" reason_ttl="53"/>
<service name="http" product="Apache httpd" version="2.2.14"
extrainfo="(Ubuntu)" method="probed" conf="10">
<cpe>cpe:/a:apache:http_server:2.2.14</cpe>
</service>
<script id="http-title" output="Go ahead and ScanMe!"/>
</port>
</ports>
<os>
<portused state="open" proto="tcp" portid="22"/>
<portused state="closed" proto="tcp" portid="1"/>
<portused state="closed" proto="udp" portid="31289"/>
<osclass type="general purpose" vendor="Linux" osfamily="Linux"
osgen="2.6.X" accuracy="100">
<cpe>cpe:/o:linux:linux_kernel:2.6.39</cpe>
</osclass>
<osmatch name="Linux 2.6.39" accuracy="100" line="39278"/>
</os>
<uptime seconds="23450" lastboot="Fri Sep 9 12:03:04 2011"/>
<distance value="11"/>
<tcpsequence index="199" difficulty="Good luck!"
values="49018209,48C3EBED,495A2E7F,493EF30C,48ED43B3,495A9B0C"/>
<ipidsequence class="All zeros" values="0,0,0,0,0,0"/>
<tcptssequence class="1000HZ"
values="165CC09,165CC6E,165CCD2,165CD36,165CD9A,165CE48"/>
<trace port="256" proto="tcp">
<!-- Several hop elements removed for brevity -->
<hop ttl="9" ipaddr="72.52.92.109" rtt="15.69" host="10gigabitethernet1-1.core1.fmt1.he.net"/>
<hop ttl="10" ipaddr="64.62.250.6" rtt="12.06" host="linode-llc.10gigabitethernet2-3.core1.fmt1.he.net"/>
<hop ttl="11" ipaddr="74.207.244.221" rtt="16.55" host="li86-221.members.linode.com"/>
</trace>
<times srtt="26517" rttvar="19989" to="106473"/>
</host>
<runstats>
<finished time="1315618434" timestr="Fri Sep 9 18:33:54 2011" elapsed="13.66"
summary="Nmap done at Fri Sep 9 18:33:54 2011; 1 IP address (1 host up)
scanned in 13.66 seconds" exit="success"/>
<hosts up="1" down="0" total="1"/>
</runstats>
</nmaprun>
Another advantage of XML is that its verbose nature makes it easier to read and understand than other formats. Readers familiar with Nmap in general can likely understand most of the XML output in Example 13.9, “An example of Nmap XML output” without further documentation. The grepable output format, on the other hand, is tough to decipher without its own reference guide.
There are a few aspects of the example XML output which may not be self-explanatory. For example, look at the two port elements in Example 13.10.
<port protocol="tcp" portid="22">
<state state="open" reason="syn-ack" reason_ttl="56"/>
<service name="ssh" product="OpenSSH" version="4.3"
extrainfo="protocol 2.0" method="probed" conf="10"/>
<script id="ssh-hostkey"
output="1024 60:ac:4d:51:b1:cd:85:09:12:16:92:76:1d:5d:27:6e (DSA)
2048 2c:22:75:60:4b:c3:3b:18:a2:97:2c:96:7e:28:dc:dd (RSA)"/>
</port>
<port protocol="tcp" portid="113">
<state state="closed" reason="reset" reason_ttl="56"/>
<service name="auth" method="table" conf="3"/>
</port>
The port protocol, ID (port number), state, and service name are the same as would be shown in the interactive output port table. The service product, version, and extrainfo attributes come from version detection and are combined into one field of the interactive output port table. The method and conf attributes aren't present in any other output types. The method can be table, meaning the service name was simply looked up in nmap-services based on the port number and protocol, or it can be probed, meaning that it was determined through the version detection system. The conf attribute measures the confidence Nmap has that the service name is correct. The values range from one (least confident) to ten. Nmap has a confidence level of only 3 for ports determined by table lookup, while it is highly confident (level 10) that port 22 of Example 13.10, “Nmap XML port elements” is OpenSSH, because Nmap connected to the port and found an SSH server identifying as OpenSSH.
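As a rough illustration of how a program might read these attributes, the following Python sketch parses the two port elements with the standard library's xml.etree.ElementTree module. The PORTS_XML constant, the wrapping ports root element, and the summarize_ports function name are assumptions made for this example (the script element is omitted for brevity); none of them are part of Nmap.

```python
import xml.etree.ElementTree as ET

# Port elements adapted from Example 13.10, wrapped in a <ports> root
# (an assumption for this sketch) so the fragment is well-formed on its own.
PORTS_XML = """
<ports>
  <port protocol="tcp" portid="22">
    <state state="open" reason="syn-ack" reason_ttl="56"/>
    <service name="ssh" product="OpenSSH" version="4.3"
             extrainfo="protocol 2.0" method="probed" conf="10"/>
  </port>
  <port protocol="tcp" portid="113">
    <state state="closed" reason="reset" reason_ttl="56"/>
    <service name="auth" method="table" conf="3"/>
  </port>
</ports>
"""


def summarize_ports(xml_text):
    """Return one dict per <port> with its state, service name, method, and conf."""
    rows = []
    for port in ET.fromstring(xml_text).iter("port"):
        state = port.find("state")
        service = port.find("service")
        has_service = service is not None
        rows.append({
            "port": int(port.get("portid")),
            "protocol": port.get("protocol"),
            "state": state.get("state") if state is not None else None,
            "name": service.get("name") if has_service else None,
            # "table" means a lookup by port number; "probed" means version detection.
            "method": service.get("method") if has_service else None,
            "conf": int(service.get("conf")) if has_service else None,
        })
    return rows


for row in summarize_ports(PORTS_XML):
    print(row)
```

A caller could, for instance, treat any row whose method is table as an unverified guess and rescan those ports with version detection enabled.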
One other aspect that some users find confusing is that the attributes /nmaprun/@start and /nmaprun/runstats/finished/@time hold timestamps given in Unix time, the number of seconds since January 1, 1970. This is often easier for programs to handle. For the convenience of human readers, versions 3.78 and newer include the equivalent calendar time written out in the attributes /nmaprun/@startstr and /nmaprun/runstats/finished/@timestr.
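To show why the numeric form is convenient for programs, here is a minimal Python sketch that converts the two timestamps from Example 13.9 into timezone-aware datetime objects and subtracts them. The unix_to_utc helper name is an assumption made for this example. The result is expressed in UTC, so the hour differs from the calendar string in the example, which presumably reflects the scanning machine's local time zone.

```python
from datetime import datetime, timezone


def unix_to_utc(value):
    """Convert a Unix-time attribute value (a string of digits) to a UTC datetime."""
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


# Values of /nmaprun/@start and /nmaprun/runstats/finished/@time in Example 13.9.
start = unix_to_utc("1315618421")
finished = unix_to_utc("1315618434")

print(start.isoformat())
# Whole-second difference; the elapsed attribute in the XML is more precise.
print((finished - start).total_seconds())
```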
The original command line (argv array) is stored in the attribute /nmaprun/@args. Arguments are separated by whitespace. Arguments that originally contained whitespace are enclosed in double quotes (which appear as &quot; in the XML). Individual characters can also be escaped with backslashes within quoted strings.
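One way to approximate the reverse operation in Python is shlex.split, which separates on whitespace and honors double quotes and backslash escapes inside them. This is a sketch under the assumption that POSIX shell quoting rules are close enough to the format described above; shlex also treats single quotes and unquoted backslashes specially, so edge cases may differ. The split_args name and the second command line are hypothetical. An XML parser will already have decoded any &quot; entities before the attribute value reaches this function.

```python
import shlex


def split_args(args_attr):
    """Split the value of /nmaprun/@args back into a list of arguments.

    Assumes POSIX-shell-like quoting, which approximates the rules
    described in the text but is not guaranteed to match them exactly.
    """
    return shlex.split(args_attr)


# The args value from Example 13.9.
print(split_args("nmap -T4 -A -p 1-1000 -oX - scanme.nmap.org"))

# A hypothetical value in which one argument contained a space.
print(split_args('nmap -oN "my scan.txt" scanme.nmap.org'))
```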
Nmap includes a document type definition (DTD) which allows XML parsers to validate Nmap XML output. While it is primarily intended for programmatic use, it can also help humans interpret Nmap XML output. The DTD defines the legal elements of the format, and often enumerates the attributes and values they can take on. It is reproduced in Appendix A, Nmap XML Output DTD.
Using XML Output
The Nmap XML format can be used in many powerful ways, though few users actually take any advantage of it. I believe this is due to inexperience of many users with XML, combined with a lack of practical, solution-oriented documentation on using the Nmap XML format. This chapter provides several practical examples, including the section called “Manipulating XML Output with Perl”, the section called “Output to a Database”, and the section called “Creating HTML Reports”.
A key advantage of XML is that you do not need to write your own parser as you do for specialized Nmap output types such as grepable and interactive output. Any general XML parser should do.
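As a small demonstration of that point, the sketch below uses Python's bundled xml.etree.ElementTree parser to list the open ports for each host. The SAMPLE document is an abbreviated version of Example 13.9, and the open_ports_by_host function name is an assumption made for this example. A real program would more likely read the XML from a file written with -oX, or from Nmap's standard output.

```python
import xml.etree.ElementTree as ET

# Abbreviated from Example 13.9; most elements and attributes are omitted.
SAMPLE = """<?xml version="1.0"?>
<nmaprun scanner="nmap" version="5.59BETA3" xmloutputversion="1.03">
  <host>
    <status state="up" reason="echo-reply"/>
    <address addr="74.207.244.221" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22">
        <state state="open" reason="syn-ack"/>
        <service name="ssh" product="OpenSSH"/>
      </port>
      <port protocol="tcp" portid="80">
        <state state="open" reason="syn-ack"/>
        <service name="http" product="Apache httpd"/>
      </port>
    </ports>
  </host>
</nmaprun>
"""


def open_ports_by_host(xml_text):
    """Map each host's first listed address to a list of its open port numbers."""
    result = {}
    for host in ET.fromstring(xml_text).iter("host"):
        address = host.find("address")
        if address is None:
            continue  # skip hosts without an address element
        open_ports = []
        for port in host.iter("port"):
            state = port.find("state")
            if state is not None and state.get("state") == "open":
                open_ports.append(int(port.get("portid")))
        result[address.get("addr")] = open_ports
    return result


print(open_ports_by_host(SAMPLE))
```

No Nmap-specific parsing logic is needed here; the code only names the elements and attributes it cares about, so elements it does not mention are simply ignored.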
Nmap XML output can of course be viewed in any text editor or XML editor. Some spreadsheet programs, including Microsoft Excel, are able to import Nmap XML data directly for viewing. These general-purpose XML processors share the limitation that they treat Nmap XML generically, just like any other XML file. They don't understand the relative importance of elements, nor how to organize the data for a more useful presentation. The use of specialized XML processors that make sense of Nmap XML output is the subject of the following sections.