20000808, 20020817  

Introduction
------------

The Test Traffic Traceroute Database is a project to store the 
results of the TTM traceroute measurements in a RDBMS. It replaces
an earler implemenation with ASCII files (VBYE/RBYT).
The DB and accompanying programs have been optimised for 
fast performance of the daily TTM data processing, which needs
to find matching routes for each delay measurement.

The project has two main parts:

- The updating part consists of reading the RVEC files provided by the test 
  boxes and storing them in the Database.
- The querying part allows people to query the database for specific routes
  at specific timestamps.

Original design:   	   Manuel Valente, RIPE NCC Software group 
Revised for AS extensions: Rene Wilhelm,   RIPE NCC New Projects group

Inputs/outputs/External Progs and Files used
--------------------------------------------

Deliverables:

initSQL		 - SQL instructions to initialize the database
updatedb	 - update the Database
routequery	 - a backend for queries via CGI
testquery	 - test the DB query performance
libtraceroutdb.a - a library with utility functions, provides
		   query functions to other programs.
fillrecords,
fillroutes 	 - tools to transfer old ASCII data to SQL DB
transferdb	 - transfer data to Old_* tables from which they
		   eventually can be purged

Input:

The update program takes as input a file listing the data files retrieved
from remote testboxes by the central TTM processing machines. Of these
files, only those whose name starts with RVEC (traceRoute VECtors)
are opened for processing.

The testquery program takes input from an SNDP (SeND Packets) file
to test the performance of the query part.
- SNDP files provided by the test boxes, which contain timing information.
- RVYE and RBYT files used to fill the DB from ASCII files.

We also use the LIST_OF_TESTBOXES file, which maps a boxname to an identifier.

Output: - the output of updatedb is sent to the MySQL database.
	- a CGI interface is provided for querying the DB

OS to run on, memory/disk space/speed limitation 
------------------------------------------------

OS: Solaris 2.6

The updating program is very CPU-intensive, it also may use large amounts of 
memory after running for a long time.

A large amount of disk space will also be required to store the contents of 
the DB.

Details:
------

1 - Structure of the Database:

Ranges:

Each box is mapped to an identifier taken from the LIST_OF_TESTBOXES file.
id      name            IP
1       tt01.ripe.net   193.0.0.4
2       osdorp.ripe.net 193.0.0.2

As the input files don't mention names, only IP numbers, a hash table
mapping IPs to IDs internal to the program is enough; no need for a
table.

Routes:
CREATE TABLE Routes (
  id        int  (11) NOT NULL auto_increment,
  len       int   (1) NOT NULL,
  crc       char (32) NOT NULL, 
  route     TEXT,
  PRIMARY KEY (id)
);

id  len   crc         route
43  12    ow4kjke/ee  195.114.80.2 193.112.33.44 ...

The Routes table contains all the routes that were seen at least once. 
It is read at initialization and stored in a hash indexed by the MD5 CRC.

Len is the number of hops for a given route.

crc is a unique identifier used to compare routes rapidly.

Route is the route itself, a string composed of IPs in integer format.

ASpaths:
CREATE TABLE ASpaths (
  id        int  (11) NOT NULL auto_increment,
  len       tinyint (3) unsigned,
  crc       path (32) NOT NULL, 
  route     TEXT,
  PRIMARY KEY (id)
);

The ASpaths table contains a one to one translation of the IPs
observed in a route to the originating AS (autonomous system) 
numbers. Because IPs can move from one AS to another (mergers,
take overs of ISPs), we cannot exclusively link one ASpath
to one Route; instead of adding a column to the routes table
we have to maintain a seperate table with almost identical 
structure. As discussed below, the Records table will provide
the link of which ASpath and which Route were seen at a particular
time. If no AS can be found for a hop, a 0 is stored in the path.

In principle, the length (#hops) of the ASpath is identical
to that of the Route vector; however, there's on exception in 
the production database: the first entry in the table has
a lenght of 0, and an empty path, to accomodate all Records
that were entered in the DB before the ASpath information
was added.

NOTE: as the stored path is merely a mapping of the observerd
hops, one AS will appear as many times as IP numbers of the
AS were seen in the traceroute. Condensing this to a sequence
of unique hops in AS space is a task for the application programs.

Records:
CREATE TABLE Records (
  id        int  (11) NOT NULL auto_increment,
  src       int  (11) NOT NULL,
  dst       int  (11) NOT NULL,
  routeid   int  (11) NOT NULL, 
  tstart    int  (11) NOT NULL,
  tend      int  (11) NOT NULL,
  numrec    int  (11) NOT NULL,
  aspathid  int  (11) NOT NULL, 
  PRIMARY KEY (id)
); 

id   src  dst  routeid      tstart        tend   numrec
2129   4   47      149   957219451   957219451        1 

The Records table is a compressed form of the raw data from RVEC files. 
It contains the id of a range, the id of a route, the first and last 
timestamp at which a given route was seen, and the number of occurences 
of that route.

This table is filled during the execution of the updating script - all 
data is kept in memory as long as the records show the rangeid and routeid 
to be identical. When they change, a record is written to the DB.

src is the id of the source box, taken from LIST_OF_TESTBOXES
dst is the id of the target box, taken from LIST_OF_TESTBOXES

routeid is the id of the route, from the Routes table.
aspathid is the id of the ASpath, the mapping of IPs to ASnum
at the time the data were processed.

tstart and tend are the timestamps recorded at the first and last occurence
of this record.
numrec is the number of occurences of this particular record.

2 - Components:

There are several executable programs used in the project:

* updatedb
allows the daily updating of the database from the RVEC files.

Upon initialization, updatedb reads information on ranges (src-dst) from
LIST_OF_TESTBOXES, routes from the Routes table and ASpaths from the
ASpaths tables; Three hashes are built with this information (see
get_ranges() and get_table()). In addition a fourth hash is created
to map IPs to ASnums; this one will be filled while we process data.

Updatedb accepts one parameter, a filename. This filename should contain
a list of RVEC files. These files can be in plain text or compressed (.gz) 
format.

For each of these files, we do:
Read the contents of the file, and for each line, create a recstruct record
to store the contents of the line (process_file()). If we discover a new 
route, we insert it in the Routes table; similar logic applies to ASpaths

Then, we take the linked list of recstruct records that we got, and we sort 
them by src, dst, and timestamp (sort_records()).

Then we process that list of records (process_records()). 

We keep a fullrec variable to store the 'ongoing' records.

For each record:
If the record has the same src, dst, routeid and aspahthid than fullrec,
just increment the number of records in fullrec and modify the t_end to
the new record timestamp.

If not, write fullrec to the Records table and store the new record in fullrec.

If fullrec is empty, try to see if the last written record in the Records table
with the same src and dst has the same routeid and aspathid. If so, delete that
record, store the data in fullrec, and update fullrec with the new record.


* fillroutes and fillrecords
These two applications are used to fill the Routes and Records tables with data 
taken from ASCII files - VBYE for routes, RBYT for records.

The two applications read the file name from the command-line, and process one 
line at a time, filling the DB with a record for each line.

They need only to be executed once, in order to put the DB in an up-to-date 
state.

* routequery

A CGI interface must also be provided for querying the database from the web.
In order to do that, we used a two-tier system: the back-end of the CGI 
remained on one machine, while the front-end was put on the http 
server. To get the results, the front-end CGI calls a well-known port on the 
machine hosting the back-end, passes a line containing src, dst, t_start and 
t_end, and reads all the lines returned, until it finds a line containing a 
single '.'.

The results are then reformatted by the front-end CGI and displayed on the 
web browser.

The back-end (routequery) performs the following operations:
Retrieve all routes between t_start and t_end, plus the last route starting 
before t_start, if it falls within a given time limit.

* transferdb

This application is used to move the old records and routes from the database 
to alternated tables.

First, we look for every record older than a provided time range. These
records are deleted from the Records table and added to Old_Records. All
routes that do not appear any more in Records are removed from Routes and
added to Old_routes (remove_routes()).

Finally, all records in Old_Records older than a certain time range are 
removed from Old_records.

* testquery

This is the actual querying part. It is used to query which routeid was
used between two boxes at a given timestamp.

Querydb.c contains the actual query function, match_vector(), which was 
specified by the TT group.

testquery takes a filename as a command-line parameter. This filename, as for
updatedb, should contain a list of plain text or compressed SNDP files.

First, all data for a given day, with a given source box, is fetched from the
Records table and stored in a hash (see build_data()).

The queries are read from the SNDP files, which contain queries for one day, 
from a given box. Every line of the file is read and processed by 
match_vector(), using the hash as data source. 

After the whole file has been processed, the hash is cleared using 
reset_data().

3 - Comments about performance:

The performance of the scripts has been found to be acceptable but real 
performance tests still have to be made.

4 - Other Remarks:

* CRC Calculation:

From the MD5 documentation (RFC 1321):
----
The algorithm takes as input a message of arbitrary length and produces
as output a 128-bit "fingerprint" or "message digest" of the input.
It is conjectured that it is computationally infeasible to produce
two messages having the same message digest, or to produce any
message having a given prespecified target message digest.
----

A 128-bit string can have 2^128 possibilities.

Since the project generates 2.4 million routes per day, assuming that each
route is unique (to make things harder), we may have a maximum of
2400000 * 365 = 876000000 unique routes in one year.

So, the probability of having two different routes with the same crc is:
876000000 / 2^128 = .00000000000000000000000000000257

Or one chance in 3.88E30.

Since most routes appear several times, that probability is actually much 
lower.

* Dumping/Backup of the database:

The mysqldump tool should be used to dump the database - the syntax is
described in the documentation of MySQL, but a simple command line can be:

mysqldump [OPTIONS] database [tables] > Textfile