Wednesday, February 12, 2014

Fedora/Hydra/Blacklight on SmartOS (Part 2; Installing SmartOS, ZFS & SMF configuration)

Installation of SmartOS

SmartOS is a unique operating system provided by Joyent Inc. Based on illumos, its primary goal is to provide a bare-metal VM manager coupled with speed, scaling & redundancy.

First things first: we need to obtain the latest USB image. (I tried the DVD/CD option first, but a USB key was easier to work with due to space constraints within /.)

Now we need to create a bootable USB key from the image we just downloaded. Joyent is kind enough to provide very useful documentation on how to accomplish this.

The installer will ask a couple of questions about networking & which disks to use within your zpool configuration. My recommendation (if you have more than one disk) is to use only the first disk, as we will set up a more robust and optimal zpool configuration after the installation.

ZFS Configuration

At this point you should have a working SmartOS installation with one disk configured as a zpool. The zpool configuration in this example is meant to provide the following:

  1. Primary raidz3-0 (zone)
  2. Hot swappable drives
  3. Dedicated mirrored caching drives (SSD disk types for performance)
  4. Dedicated mirrored log disks for the ZIL (ZFS intent log)

This example involves a system with quite a few physical drives, both mechanical & SSD. SSDs are used only for the caching drives, since their job is to speed up access to commonly read data.

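Fig. 1 shows the command used to assemble the zpool shown in Fig. 2. One caveat: the zpool man page states that cache devices cannot be mirrored, so if your release rejects the cache mirror syntax, list the two cache disks after the cache keyword without the mirror keyword.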

Fig. 1

$ zpool create zones raidz3 c0t0d0 c0t1d0 c0t2d0 c0t3d0 spare c1t0d0 c1t1d0 cache mirror c2t0d0 c2t1d0 log mirror c3t0d0 c3t1d0

Fig. 2

$ zpool status
  pool: zones
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zones       ONLINE       0     0     0
          raidz3-0  ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t2d0  ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
        spares
          c1t0d0    AVAIL       0     0     0
          c1t1d0    AVAIL       0     0     0
        cache
          mirror-0
            c2t0d0    ONLINE       0     0     0
            c2t1d0    ONLINE       0     0     0
        logs
          mirror-1
            c3t0d0    ONLINE       0     0     0
            c3t1d0    ONLINE       0     0     0

errors: No known data errors

For more information on managing zpools, I recommend the ZFS documentation; it covers topics such as swapping out failed primary disks for disks that were designated as hot spares.

SMF Configuration

Next up is to tune the startup services and get SmartOS ready for our Fedora/Hydra/Blacklight VM instances.

Because SmartOS runs entirely in memory, any configuration that needs to persist must be placed in the default /opt/custom/smf directory, which is mounted within the OS from the zpool. SmartOS provides some documentation on this.

I disabled quite a few services that are enabled by default, as they are not necessary for the host OS. See Fig. 3 for the contents of /opt/custom/smf and Fig. 4 for a sample manifest that runs the svcadm command to shut down a service.

Fig. 3

$ ls -lah 
total 58
drwxr-xr-x   2 root     root          11 Feb 11 14:27 .
drwxr-xr-x   6 root     root           7 Feb 11 18:40 ..
-rw-r--r--   1 root     root        1.3K Feb  7 15:30 bash-history-link.xml
-rw-r--r--   1 root     root        1.1K Feb  7 15:46 disable-bind.xml
-rw-r--r--   1 root     root        1.1K Feb  7 15:46 disable-inetd.xml
-rw-r--r--   1 root     root        1.1K Feb  7 15:46 disable-multicast.xml
-rw-r--r--   1 root     root        1.1K Feb  7 15:46 disable-ntp.xml
-rw-r--r--   1 root     root        1.1K Feb 11 13:18 disable-sendmail-client.xml
-rw-r--r--   1 root     root        1.1K Feb 11 13:19 disable-sendmail.xml
-rw-r--r--   1 root     root        1.1K Feb 11 14:29 enable-ipf.xml
-rw-r--r--   1 root     root        1.3K Feb  7 15:29 inputrc-link.xml

Fig. 4

$ cat disable-sendmail.xml
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
  <service name='global/disable-sendmail' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>
    <dependency name='network' grouping='require_all' restart_on='error' type='service'>
      <service_fmri value='svc:/milestone/network:default'/>
    </dependency>
    <method_context/>
    <exec_method name='start' type='method' exec='svcadm disable sendmail' timeout_seconds='60'/>
    <exec_method name='stop' type='method' exec='svcadm enable sendmail' timeout_seconds='60'/>
    <property_group name='startd' type='framework'>
      <propval name='duration' type='astring' value='transient'/>
      <propval name='ignore_error' type='astring' value='core,signal'/>
    </property_group>
    <property_group name='application' type='application'/>
    <stability value='Evolving'/>
    <template>
      <common_name>
        <loctext xml:lang='C'>Disable sendmail services</loctext>
      </common_name>
    </template>
  </service>
</service_bundle>
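
The disable-*.xml files in Fig. 3 presumably all follow the pattern of Fig. 4, so they can be generated rather than hand-edited. The following is a minimal sketch under that assumption; disableManifest is a hypothetical helper (not part of SmartOS), and any service name other than sendmail is an assumption you would need to check against svcs output on your own host.

```javascript
// Sketch: build a one-shot SMF manifest that disables a service at boot,
// modeled on the disable-sendmail.xml listing in Fig. 4.
// disableManifest is a hypothetical helper, not a SmartOS tool.
function disableManifest(service) {
  // Keep the name simple so it is safe inside XML attributes.
  if (!/^[A-Za-z0-9_-]+$/.test(service)) {
    throw new Error(`unexpected service name: ${service}`);
  }
  return [
    "<?xml version='1.0'?>",
    "<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>",
    "<service_bundle type='manifest' name='export'>",
    `  <service name='global/disable-${service}' type='service' version='0'>`,
    "    <create_default_instance enabled='true'/>",
    "    <single_instance/>",
    "    <dependency name='network' grouping='require_all' restart_on='error' type='service'>",
    "      <service_fmri value='svc:/milestone/network:default'/>",
    "    </dependency>",
    "    <method_context/>",
    `    <exec_method name='start' type='method' exec='svcadm disable ${service}' timeout_seconds='60'/>`,
    `    <exec_method name='stop' type='method' exec='svcadm enable ${service}' timeout_seconds='60'/>`,
    "    <property_group name='startd' type='framework'>",
    "      <propval name='duration' type='astring' value='transient'/>",
    "      <propval name='ignore_error' type='astring' value='core,signal'/>",
    "    </property_group>",
    "    <property_group name='application' type='application'/>",
    "    <stability value='Evolving'/>",
    "    <template>",
    "      <common_name>",
    `        <loctext xml:lang='C'>Disable ${service} services</loctext>`,
    "      </common_name>",
    "    </template>",
    "  </service>",
    "</service_bundle>",
  ].join('\n') + '\n';
}

// Print the manifest for the service used in Fig. 4.
console.log(disableManifest('sendmail'));
```

Redirecting the output to a file such as /opt/custom/smf/disable-sendmail.xml would reproduce the layout shown in Fig. 3.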

Next we will create the necessary networking environment for our VMs and begin to set up our Fedora/Hydra/Blacklight stack.

Monday, February 10, 2014

Fedora/Hydra/Blacklight on SmartOS (Part 1; Background, Research & Planning)

Software Background

ContentDM is digital collection asset management software used by many libraries. For a complete list of features, please take a look at their informational page.

An alternative to using ContentDM is to use a combination of FOSS projects:

  1. Fedora Commons: Digital content management
  2. Solr: Enterprise search platform (this will require a separate server)
  3. Project Blacklight: Discovery interface for Solr
  4. The Hydra Project: Front end component

Planning & Research

My research on the infrastructure used for these alternatives has pointed towards a simple operating system architecture built on any number of Linux distributions.

Our ContentDM infrastructure was given a five-year life cycle. That was ten years ago. Because of the size of our digital collection, the current system now handles approximately ten terabytes. The collection consists of a very large library of images and documents in various formats.

While ten terabytes today is not much, we also have close to sixty terabytes of video objects we wish to integrate.

That would bring the current project storage to a whopping seventy terabytes. The life cycle for the project is again to be five years, but given the number of objects the solution will need to scale well beyond that.

From a purely dev-ops perspective, the traditional route is to implement a SAN, place a load balancer in front of a group of machines, keep them synced, backed up and monitored, and you're up and running.

This project, however, will use a different strategy: implement a SAN, use a load balancer, fire up SmartOS, use a KVM/Zone per application (Fedora/Hydra/Solr/Blacklight), use ACLs per VM instance, point all storage for Fedora (KVM/Zones) to the SAN, and keep them synced, backed up and monitored.

The infrastructure

The planned environment:

  1. Scalability for physical hardware:
    • HAProxy can be used in front of any physical SmartOS VM server and physical Solr servers
  2. Scalability for KVM/Zones:
    • Fedora VM instances can be added to each server instance quickly as the need arises and can use an NFS share on the SAN
    • Hydra VM instances can also be added to each server instance as new hardware is added to each server (memory, disks, etc.)
    • Blacklight VM instances can also be spun up when the need arises
  3. Security:
    • Fedora/Blacklight VM instances can be restricted with host-based ACLs to internal VM networking only
    • Hydra VM instances can be publicly accessed on each SmartOS host installation, creating an internal DMZ per server

Thursday, January 9, 2014

Speeding up nmap with node.js

The tool; nmap

When it comes to network host discovery and enumerating details of remote hosts such as operating system, device type and open ports, nmap is the tool the majority of sys admins, hackers, pen testers & script kiddies turn to first.

I consider it a great piece of software, however I also find it painfully slow.

Performance

While nmap has many options for tuning the performance of a scan, I thought it might be a worthy challenge to improve on it. The current documentation has this to say about the performance of nmap scans...

While Nmap utilizes parallelism and many advanced algorithms to accelerate these scans, the user has ultimate control over how Nmap runs.

Fair enough, let's see what we can do, shall we?

node-libnmap

While far from a complete solution, the current version (v0.0.3) of node-libnmap dramatically speeds up scans.

It does this by creating equally distributed scan blocks based on the number of hosts per subnet range and the number of CPU cores on the machine. For example, for a machine with an 8-core processor residing on the 10.0.2.0/24 network, the application generates blocks of roughly 32 addresses each:

nmap -sn -oG - 10.0.2.1-31
nmap -sn -oG - 10.0.2.33-63
nmap -sn -oG - 10.0.2.65-95
nmap -sn -oG - 10.0.2.97-127
nmap -sn -oG - 10.0.2.129-159
nmap -sn -oG - 10.0.2.161-191
nmap -sn -oG - 10.0.2.193-223
nmap -sn -oG - 10.0.2.225-255
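
The block splitting can be sketched roughly as follows. This is an illustration only, not node-libnmap's actual code: nmapCommands is a hypothetical helper, it handles only /24 and smaller networks, and unlike the listing above it covers every address contiguously (0-31, 32-63, and so on).

```javascript
// Sketch: split a small IPv4 network into per-core nmap host-discovery commands.
// nmapCommands is a hypothetical helper, not part of node-libnmap.
const os = require('os');

function nmapCommands(cidr, blockCount = os.cpus().length) {
  const [base, prefixText] = cidr.split('/');
  const prefix = Number(prefixText);
  // Assumption: only networks that fit inside the last octet are supported.
  if (!(prefix >= 24 && prefix <= 32)) {
    throw new RangeError('this sketch only handles /24 to /32 networks');
  }
  const octets = base.split('.').map(Number);
  const size = 2 ** (32 - prefix);               // addresses in the network
  const first = octets[3] - (octets[3] % size);  // align to the network start
  const count = Math.max(1, Math.min(blockCount, size));
  const perBlock = Math.ceil(size / count);      // addresses per block
  const network = octets.slice(0, 3).join('.');

  const commands = [];
  for (let start = first; start < first + size; start += perBlock) {
    const end = Math.min(start + perBlock, first + size) - 1;
    commands.push(`nmap -sn -oG - ${network}.${start}-${end}`);
  }
  return commands;
}

// Eight blocks for an 8-core machine on 10.0.2.0/24.
nmapCommands('10.0.2.0/24', 8).forEach((command) => console.log(command));
```

Passing the block count explicitly keeps the example reproducible; leaving it out falls back to os.cpus().length on the machine running the script.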

The next step uses node's child process model to spawn a new child process for every block generated, each executing asynchronously (with the help of the async module).
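
A minimal sketch of that step is shown below. It swaps the async module for plain Promise.all to stay dependency-free, so it is not how node-libnmap itself is written; discover and parseUpHosts are hypothetical names, and the run parameter exists only so the nmap call can be replaced in tests.

```javascript
// Sketch: run one nmap child process per block, in parallel, and merge results.
// Uses Promise.all instead of the async module; names here are hypothetical.
const { execFile } = require('child_process');
const { promisify } = require('util');

const execFileAsync = promisify(execFile);

// Pull the addresses out of grepable (-oG) lines such as
// "Host: 10.0.2.2 ()       Status: Up".
function parseUpHosts(output) {
  return output
    .split('\n')
    .map((line) => /^Host: (\S+) .*Status: Up/.exec(line))
    .filter(Boolean)
    .map((match) => match[1]);
}

// Default runner: spawn nmap for a single range and resolve with its stdout.
function runNmap(range) {
  return execFileAsync('nmap', ['-sn', '-oG', '-', range]);
}

// Start every scan at once; Promise.all preserves the order of the ranges.
async function discover(ranges, run = runNmap) {
  const results = await Promise.all(ranges.map((range) => run(range)));
  return results.flatMap(({ stdout }) => parseUpHosts(stdout));
}

module.exports = { discover, parseUpHosts };
```

Calling discover(['10.0.2.1-31', '10.0.2.33-63']) would resolve with the list of hosts that answered, provided nmap is installed and on the PATH.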

Benchmarks

Here are some preliminary benchmarks and samples.

The results here all come from a virtual environment with limited system resources, but they should give an overall picture of scan performance. My VM environment uses 8 cores with 4 threads per core, so require('os').cpus().length returns 32.

Nmap host discovery

$ time nmap -sn -oG - 10.0.2.0/24
# Nmap 5.51 scan initiated Wed Jan  8 18:54:07 2014 as: nmap -sn -oG - 10.0.2.0/24
Host: 10.0.2.2 ()       Status: Up
Host: 10.0.2.3 ()       Status: Up
Host: 10.0.2.15 ()      Status: Up
# Nmap done at Wed Jan  8 18:54:26 2014 -- 256 IP addresses (3 hosts up) scanned in 19.33 seconds

real    0m19.339s
user    0m0.052s
sys     0m0.080s

Nmap host discovery using node-libnmap

$ time node test/run.js 
{ adapter: 'eth0',
  properties: 
   { address: '10.0.2.15',
     netmask: '255.255.255.0',
     family: 'IPv4',
     mac: '52:54:00:12:34:56',
     internal: false,
     cidr: '10.0.2.0/24',
     hosts: 256,
     range: { start: '10.0.2.1', end: '10.0.2.254' } },
  neighbors: [ '10.0.2.2', '10.0.2.3', '10.0.2.15' ] }

real    0m3.323s
user    0m0.326s
sys     0m0.412s
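
To put the two wall-clock figures side by side, the small sketch below converts the real values reported by time into seconds and divides them. The toSeconds helper is hypothetical, and since each figure comes from a single run inside a VM the ratio is indicative only.

```javascript
// Sketch: compare the two "real" timings reported by `time` above.
// toSeconds is a hypothetical helper for values such as "0m19.339s".
function toSeconds(text) {
  const match = /^(\d+)m([\d.]+)s$/.exec(text);
  if (!match) {
    throw new Error(`unrecognized time value: ${text}`);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

const plainNmap = toSeconds('0m19.339s'); // nmap -sn -oG - 10.0.2.0/24
const libnmap = toSeconds('0m3.323s');    // node test/run.js
const speedup = plainNmap / libnmap;

console.log(`speedup: ${speedup.toFixed(2)}x`); // roughly 5.8x
```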

And an example with multiple adapters on multiple 802.1Q segments:

$ time node test/run.js 
[ { adapter: 'eth0',
    properties: 
     { address: '10.0.2.15',
       netmask: '255.255.255.0',
       family: 'IPv4',
       mac: '52:54:00:12:34:56',
       internal: false,
       cidr: '10.0.2.0/24',
       hosts: 256,
       range: { start: '10.0.2.1', end: '10.0.2.254' } },
    neighbors: [ '10.0.2.2', '10.0.2.3', '10.0.2.15' ] },
  { adapter: 'eth1',
    properties: 
     { address: '192.168.2.15',
       netmask: '255.255.255.128',
       family: 'IPv4',
       mac: '52:54:00:12:34:57',
       internal: false,
       cidr: '192.168.2.0/25',
       hosts: 128,
       range: { start: '192.168.2.1', end: '192.168.2.254' } },
    neighbors: [ '192.168.2.2', '192.168.2.3', '192.168.2.15' ] } ]

real    0m3.447s
user    0m0.493s
sys     0m0.796s

Your mileage may vary.