The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing
The Network Weather Service (NWS) is designed to provide accurate forecasts of dynamically changing performance characteristics from a distributed set of metacomputing resources. It is implemented on Unix using TCP/IP sockets. The current NWS consists of four component processes: the Persistent State process, the Name Server process, the Sensor process, and the Forecaster process. Each NWS sensor gathers and stores (timestamp, performance measurement) pairs for a specific resource, and the CPU sensor provides measurements of CPU availability on time-shared Unix systems. The NWS exports a lightweight, portable C API that contacts the system via sockets so that applications can quickly retrieve short-term performance forecasts.
Dynamically Forecasting Network Performance Using the Network Weather Service
To generate a forecast, the NWS runs several different models simultaneously
and computes a forecast from each. Its forecasting methods include mean-based,
median-based, and autoregressive methods. The paper discusses these methods in
detail and compares their performance.
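As a rough illustration of running several forecasters side by side, the Python sketch below applies a sliding-window mean and a sliding-window median to a measurement history and reports the forecast of whichever method has had the lowest cumulative absolute error so far. The window size, the function names, and the error-based selection rule are assumptions made for this example; the autoregressive methods are omitted, and none of this is taken from the NWS code.

```python
from statistics import mean, median


def window_mean(history, window=5):
    """Forecast the next value as the mean of the last `window` measurements."""
    return mean(history[-window:])


def window_median(history, window=5):
    """Forecast the next value as the median of the last `window` measurements."""
    return median(history[-window:])


# Hypothetical registry of forecasting methods (names are illustrative only).
FORECASTERS = {"mean": window_mean, "median": window_median}


def forecast(history):
    """Run every forecaster over the history and return (method name, forecast)
    for the method with the lowest cumulative one-step-ahead absolute error."""
    errors = {name: 0.0 for name in FORECASTERS}
    for t in range(1, len(history)):
        past, actual = history[:t], history[t]
        for name, method in FORECASTERS.items():
            errors[name] += abs(method(past) - actual)
    best = min(errors, key=errors.get)
    return best, FORECASTERS[best](history)
```

On a history with a single outlier, such as `[10, 10, 10, 100, 10, 10, 10]`, the median-based method accumulates less error than the mean-based one and is therefore the one selected.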
Synchronizing Network Probes to avoid Measurement Intrusiveness with the Network Weather Service
The NWS uses TCP to send small, fixed-size probes (on the order of kilobytes)
at intervals ranging from tens of seconds to several minutes, and then applies
fast statistical models to the probe histories to make performance forecasts. To
avoid probe collisions as much as possible, the NWS uses a token protocol.
The paper discusses this token protocol in detail.
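The idea of using a token to keep probes from colliding can be sketched as a small simulation: only the host currently holding the token probes its peers, and it then hands the token to the next host. This is a simplified model under assumed names (`run_token_ring`, a fixed ring order); it does not model timeouts, token loss, or recovery, which a real protocol would need.

```python
from collections import deque


def run_token_ring(hosts, rounds):
    """Simulate token passing among `hosts` for `rounds` token holds.

    Only the token holder probes; it then passes the token to the next host.
    Returns the ordered list of (prober, target) pairs.
    """
    ring = deque(hosts)
    log = []
    for _ in range(rounds):
        holder = ring[0]
        for target in hosts:
            if target != holder:
                # Placeholder for a real TCP probe from holder to target.
                log.append((holder, target))
        ring.rotate(-1)  # pass the token to the next host in the ring
    return log
```

Because probes are issued only while a host holds the token, no two hosts in the simulation ever probe at the same time.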
The Architecture of CoralReef: An Internet Traffic Monitoring Software Suite
The CoralReef architecture is organized primarily into two "stacks" of software components: the "raw traffic stack", which deals directly with individual PDUs (packets or cells) read from a network traffic stream, and the "flows stack", which deals with traffic data that have been aggregated into "flow intervals". The core of the raw traffic stack is the libcoral C library, which provides a consistent API for capturing traffic from specialized ATM and POS capture cards from multiple vendors and from pcap interfaces. One of the design goals is to provide the same API in C, C++, and Perl. The flows stack includes modules for storing and manipulating tables of frequently collected aggregate data.
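To make the notion of aggregating raw packets into "flow intervals" concrete, here is a minimal Python sketch. It is not the libcoral or CoralReef API: the packet tuple layout, the simplified (source, destination, protocol) flow key, and the 60-second interval are all assumptions chosen for illustration.

```python
from collections import defaultdict


def aggregate_flows(packets, interval=60):
    """Group packets into per-interval flow tables.

    `packets` is an iterable of (timestamp, src, dst, proto, size) tuples.
    Returns {interval_index: {(src, dst, proto): (packet_count, byte_count)}}.
    """
    tables = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for ts, src, dst, proto, size in packets:
        bucket = int(ts // interval)  # which flow interval the packet falls in
        entry = tables[bucket][(src, dst, proto)]
        entry[0] += 1       # packets seen for this flow in this interval
        entry[1] += size    # bytes seen for this flow in this interval
    return {
        bucket: {key: tuple(counts) for key, counts in flows.items()}
        for bucket, flows in tables.items()
    }
```

Each interval yields one table of per-flow counters, which is the kind of aggregate that a flows-level component would store and manipulate instead of individual packets.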
CoralReef software suite as a tool for system and network administrators
CoralReef, designed by CAIDA, is a convenient set of passive data tools that provides a consistent interface for a wide range of network analysis applications, including raw capture, flows analysis, and real-time report generation. CoralReef runs on Unix and offers device-independent access to network data from OCXmon hardware, native OS network interfaces, and trace files; programming APIs; a variety of bundled analysis applications; and greater flexibility in remote access and administration. Its distinguishing feature is that it supports a large number of features at many layers and provides APIs and hooks at every layer, making it easier for anyone to apply it in unanticipated ways and to develop new applications with minimal duplicated effort. Access to native OS network interfaces goes through libpcap, the packet capture library that originated with tcpdump.
Measuring the Immeasurable: Global Internet Measurement Infrastructure
This is a survey of existing public and mission-specific Internet measurement
infrastructures, including CoralReef, IEPM, I2 (Abilene), NIMI, NLANR AMP, NLANR
PMA, and NPACI NWS. It compares them comprehensively across a range of criteria.
A System for Flexible Network Performance Measurement
The National Internet Measurement Infrastructure (NIMI) is a software system for
building network measurement infrastructures. A key NIMI design goal is scalability
to potentially thousands of NIMI probes within a single infrastructure. Security
is also a key concern of the design. Through measurement "modules", NIMI
probes can wrap a variety of measurement tools, including traceroute,
TReno, mtrace, and zing. However, this flexibility is limited to integrating
simple measurement tools as probes. In contrast, our Resource Monitor design
aims to provide a common interface for integrating different measurement
infrastructures or architectures. We may still want to borrow one NIMI design
idea: stand-alone third-party measurement software modules can be "plugged in"
to NIMI and are executed by "exec()" calls. It is also worth mentioning that the
NIMI infrastructure consists of a Measurement Client (MC), a Data Analysis Client
(DAC), and NIMI probes; we designed the Resource Monitor along very similar lines.
NIMI is being used by two different projects: MINC and Web100.
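The plug-in idea, in which a stand-alone tool is run as a separate program and its output collected, can be sketched in Python with the standard `subprocess` module. This is only an analogy to NIMI's "exec()" approach, not NIMI code: the `MODULES` registry, the `run_module` helper, and the dummy "echo-test" module (which just prints a fixed line) are hypothetical.

```python
import subprocess
import sys

# Hypothetical registry mapping a module name to the command that runs it.
# A real deployment might list tools such as traceroute here; the dummy entry
# below only prints a fixed line so the sketch runs anywhere.
MODULES = {
    "echo-test": [sys.executable, "-c", "print('rtt_ms=12.5')"],
}


def run_module(command, timeout=30):
    """Run an external measurement tool as a child process.

    Returns (exit_code, captured_stdout).
    """
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    return result.returncode, result.stdout
```

Keeping the tool in its own process means a new measurement tool can be added by registering a command, without changing the code that schedules and collects measurements.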
Passively Monitoring Networks at Gigabit Speeds Using Commodity Hardware and Open Source Software
Traffic monitoring tools that worked three years ago, such as MRTG polling traffic
information from router and switch interfaces via SNMP MIB-II variables, are no
longer adequate for monitoring gigabit networks. The author presents a new passive
monitoring architecture designed specifically for gigabit networks. The solution
connects a Juniper switch to a PC running ntop, a new monitoring application. Ntop
performs several kinds of measurement, including RMON-like measurements,
NetFlow-like measurements, and security measurements. The Juniper is used for
accounting "easy to count" traffic (e.g. traffic volume) and for mirroring and
filtering traffic (e.g. discarding uninteresting packets), whereas the PC is
deployed for complex accounting. To overcome the performance bottleneck, the
author uses a two-layer architecture consisting of a preprocessor placed in front
of the existing monitoring application. The simplest preprocessor is a traffic
sampler that discards packets according to a specific rate, similar to what
sFlow-based probes do. Juniper instrumentation is performed transparently by
libpcap by means of JUNOScript.
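The two-layer arrangement, with a sampling preprocessor feeding an unchanged monitoring application, might look like the following Python sketch. It uses deterministic 1-in-N sampling for simplicity, whereas sFlow-style probes typically sample randomly; the function names and the byte-string packet representation are assumptions, and `monitor` is only a stand-in for a real application such as ntop.

```python
def sample_packets(packets, rate):
    """Preprocessor: pass through one packet out of every `rate`, discard the rest."""
    if rate < 1:
        raise ValueError("rate must be a positive integer")
    for index, packet in enumerate(packets):
        if index % rate == 0:
            yield packet


def monitor(packets):
    """Stand-in for the monitoring application: count packets and bytes."""
    count = 0
    total_bytes = 0
    for packet in packets:
        count += 1
        total_bytes += len(packet)
    return count, total_bytes
```

A call such as `monitor(sample_packets(stream, 4))` leaves the monitoring code untouched while cutting the number of packets it must process to roughly a quarter.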
The NLANR Network Analysis Infrastructure
The National Laboratory for Applied Network Research (NLANR) is developing a Network Analysis Infrastructure (NAI) that focuses mainly on passive measurement, active measurement, SNMP and BGP data analysis, and presenting the results of analysis to the HPC community. Our Resource Monitor has a very similar goal, so our design may benefit from the NAI design.
The NLANR NAI infrastructure supports all three types of network monitoring: passive measurement, active measurement and control monitoring. It also includes a number of projects:
-Cichlid has been developed for representing and analyzing large volumes of data.
-The NLANR passive measurement project utilizes OCXmon and FDDI monitors. Its software, CoralReef, includes many data analysis programs, such as flows analysis, packet size and frequency histograms, packet size run lengths, protocol and port breakdown, host and autonomous system matrices, type of service breakdown, and ASCII dumping of packets.
-AMP is NLANR's active measurement project. Its focus is on making site-to-site
measurements of round-trip time (RTT), packet loss, topology, and throughput
across the National Science Foundation (NSF) approved HPC networks. The monitors
send ICMP packets to one another to record timing, and the route to each other
monitor is recorded using traceroute. AMP also uses an active mirror system to
keep the monitoring data updated in real time. In this project, Automatic
Event Detection would be useful to allow an interested party to register to
receive notification of interesting events on some or all paths.
Network performance visualization: insight through animation
The National Laboratory for Applied Network Research (NLANR) has a number of network measurement projects that produce gigabytes of data and thousands of Web pages of graphs and summaries each day. These include the OCxMON Passive Monitoring and Analysis project (PMA), the Active Monitoring Project (AMP), and the collection and analysis of SNMP and BGP data. To avoid overlooking important network events, NLANR has developed the Cichlid tool for visualizing and animating real-time data sets in three dimensions.
Cichlid is free, portable C code that uses the OpenGL and GLUT graphics libraries; it is currently used on the FreeBSD, Linux, Microsoft Windows, and IRIX platforms.
Cichlid supports graphs that show matrices of network delays between pairs of sites, traffic distribution over address blocks, traffic volume by protocol and source, latencies from a single site to others, as well as network evolution over time.
A useful lesson from Cichlid is that it provides an entire visualization
infrastructure, including the rendering and data transfer functionality.
The function modules we designed for the Resource Monitor closely resemble
Cichlid's three primary functions: (1) Abstraction and Modeling, (2) Collection
and Distribution, and (3) Visualization. Abstracting data with a good model
is one of the keys to the design. Cichlid uses a DataSet to abstract and
transfer data from server to client; however, the paper does not describe the
abstraction policy behind the DataSet. The DataSet abstraction model is a good
design approach that we might want to try. All of the data transport in Cichlid
is implemented on top of TCP; the data-independence restriction prohibits
Cichlid from being used over UDP.
Integrating Active Methods and Flow Meters - an implementation using NeTraMet
The goal of this paper is to integrate active methods and performance metrics into the NeTraMet flow meter to obtain richer functionality. The approach uses two monitoring systems that combine passive and active components: traffic meters that identify the traffic flows and maintain counters, and monitoring packets sent periodically or according to a statistical distribution.
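The difference between sending monitoring packets periodically and sending them according to a statistical distribution can be illustrated by generating the two kinds of send schedule. The sketch below assumes exponentially distributed gaps (a Poisson process) for the statistical case; the paper summary does not say which distribution is used, and the function names are illustrative, not part of NeTraMet.

```python
import random


def periodic_schedule(interval, count, start=0.0):
    """Send times spaced exactly `interval` seconds apart."""
    return [start + i * interval for i in range(count)]


def poisson_schedule(mean_interval, count, start=0.0, seed=None):
    """Send times whose gaps are exponentially distributed with the given mean.

    The exponential gap distribution is an assumption made for this example.
    """
    rng = random.Random(seed)
    current = start
    times = []
    for _ in range(count):
        current += rng.expovariate(1.0 / mean_interval)
        times.append(current)
    return times
```

Randomized gaps avoid synchronizing the probes with periodic behavior in the network, at the cost of a less predictable probe load.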
Developing a Web100 Based Network Diagnostic Tool (NDT)
The Network Diagnostic Tool (NDT) was developed to perform the basic investigative work of finding and fixing a network performance or configuration problem, using a Web100-based approach. By combining Web100 data with measured TCP throughput data, NDT can identify duplex mismatch, hardware faults, faulty bandwidth estimation, network link speed, full- or half-duplex links, and congestion. A command-line version of the web-based NDT was developed based on the NLANR Iperf tool.