Large-scale projects generate massive amounts of data. This data is transmitted through switches and routers between scientific organizations and universities, often over transcontinental links. Issues such as congestion, link failures, and DDoS attacks must be identified and fixed quickly, because they can delay or lose data in an environment where timely transfer is of the essence.

Instrumentation and measurement frameworks such as perfSONAR let users query and extract current and historical network statistics such as one-way delay, throughput, bandwidth, and jitter. Automated techniques that leverage network data from frameworks such as perfSONAR and NLANR AMP to identify anomalies have been developed previously. The methodologies adopted include principal component analysis (PCA), plateau detection, and Kalman filtering. Most of these methodologies suffer from one of three major drawbacks: they are not suitable for online analysis, they perform poorly because of a high number of false positives, or they take on the order of days to detect an anomaly.

The algorithm presented here is a reinforcement learning algorithm for real-time identification of anomalies. It is a slight variation of the methodology proposed in the following paper:
Prasad Calyam, Jialu Pu, Weiping Mandrawa, and Ashok Krishnamurthy. “OnTimeDetect: Dynamic Network Anomaly Notification in perfSONAR Deployments.” In Annual IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2010.
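
For readers unfamiliar with plateau detection, one of the earlier methodologies mentioned above, the following is a minimal sketch of a static-threshold variant. It is not the algorithm described in this document, nor the adaptive scheme of the cited paper. The class name `PlateauDetector` and the parameters `summary_size`, `sensitivity`, and `trigger_count` are hypothetical and chosen only for illustration. The idea is to keep a window of recent "normal" samples, flag samples that deviate from the window mean by more than a multiple of its standard deviation, and declare an anomaly only after several consecutive flagged samples.

```python
from collections import deque
from statistics import mean, pstdev


class PlateauDetector:
    """Simplified static-threshold plateau detector (illustrative only)."""

    def __init__(self, summary_size=20, sensitivity=3.0, trigger_count=3):
        # All three parameters are assumptions made for this sketch.
        self.summary = deque(maxlen=summary_size)  # recent "normal" samples
        self.sensitivity = sensitivity             # threshold multiplier on std dev
        self.trigger_count = trigger_count         # consecutive outliers required
        self.outliers = 0

    def update(self, value):
        """Feed one sample; return True when an anomaly is declared."""
        if len(self.summary) < self.summary.maxlen:
            self.summary.append(value)  # still learning the baseline
            return False
        mu = mean(self.summary)
        sigma = pstdev(self.summary)
        if abs(value - mu) > self.sensitivity * sigma:
            self.outliers += 1
            if self.outliers >= self.trigger_count:
                self.outliers = 0
                self.summary.clear()  # relearn the baseline at the new level
                return True
            return False
        # Sample looks normal: reset the outlier run and extend the baseline.
        self.outliers = 0
        self.summary.append(value)
        return False


# Made-up one-way delay samples (ms): a stable baseline, then a sustained shift.
detector = PlateauDetector()
delays_ms = [10.0, 10.2, 9.8, 10.1, 9.9] * 4 + [15.0, 15.1, 14.9]
alerts = [i for i, d in enumerate(delays_ms) if detector.update(d)]
print(alerts)
```

Requiring several consecutive outliers is what keeps a single spike from raising an alert, at the cost of a short detection delay. A fixed `sensitivity` is also a common source of the false positives noted above.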