My Story…
Throughout my life I have pushed the boundaries of technology, with occasional forays into the artistic and musical side of things.
Let the Story Begin…
I began my career in 1988 as an analytical/environmental chemist looking to apply computers, data collection, and analytics to chemistry. Data science and analysis have been a part of every stage of my career since. My first stage as a developer started in 1991, working on embedded systems and serving as lead architect for a message-based multi-processing system for automated mail sorting. In the next stage I moved into more distributed architectures and agent technology, first with Fidelity and then with Sonalysts. During this stage I built a basement lab to begin exploring network security. I moved further into network security in 2004 by starting my own group within Sonalysts, called Guardian Services. In 2006 I created a new technology focused on aggregate behavior analysis and won a contract with the Dept. of Homeland Security; in 2013 I wrote and received my first patent based on that work. In 2018 I moved into cloud technologies, private cloud first and public cloud later, at Secureworks, Inc. For one of those projects, I designed and implemented an annotation capability used to label threat security events.
Leveraging Open Source Technology to Start Up and Establish a Disruptive Cyber Defense Capability
In 2006, with $1M in initial seed capital (and tons of sweat equity) and a part-time team of two that eventually expanded to 15 diverse individuals, we produced a tool that existing MSSPs and cyber defense tooling couldn't replicate: Occulex.
The team comprised multiple public and private institutions with which we shared data and ideas on: the need for a threat-focused ontology (today we have MITRE ATT&CK), near-real-time risk, trust models (a NATO paper), and moving target defense strategies based on device trust. Toward the end, around 2010, we began conceptualizing the fusion of the cyber and physical domains, publishing a journal article and presenting the idea. Today that is known as IT/OT fusion.
This is an overview of that journey; the lessons learned can be applied to a diverse set of industry sectors: LLMs, cyber security, healthcare, biomedical, genetics, and more.
Things started in a Small Lab in My Basement in the 90s
I began experimenting with open source in the mid to late 90s (e.g. Slackware, OpenBSD, PF, OpenPCE, Snort) when I created a small network security lab in my basement. As time progressed, I installed honeypots, network flow analysis tools (US-CERT SiLK), pen-testing tools (Metasploit, Nmap), network taps, NIDS (Snort, later Suricata and Bro), and firewalls built on OpenBSD and later pfSense. I really got to understand the data types, current threat TTPs, and sensor bias and fidelity.
That knowledge allowed me to cultivate the idea of fusing cyber telemetry in a completely different fashion.
Evolved the Lab to Focus on the Development of Aggregate Behavior Analysis
By 2006, that little lab had evolved (built by two people, a colleague and myself) into an ingestion capability (driven by an Endace DAG card), an event streaming pipeline, transformation software (roughly 250k lines of C++ code built on OpenPCE), and HPC support to foster the development of ML techniques on the fused data sets. R was used for initial investigations before we attempted migrating algorithms to the OpenMPI cluster. Today, that rack of servers could live in the cloud (on AWS, for example, using RDS, S3, MSK, EKS, SageMaker, Jupyter, and Spark).
Created a 3D Visualization Tool to Analyze Aggregate Data and Drill into Raw Data
In 2010, the data produced by the pipeline was so different and complex that we created an immersive 3D visualization tool in Java and OpenGL (~50k lines of Java UI code in a Model-View-Controller style architecture). We integrated MaxMind for IP geolocation and could aggregate up to Autonomous Systems (AS). The tool allowed analysts to interact, zoom in, highlight device behaviors, and drill into the raw flow data. Someone called it a "Cyber MRI," and later we called the platform Occulex. More recently, I have researched the use of Unreal Engine 5 to create immersive, collaborative analytic environments.
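For readers curious what the geolocation and AS roll-up step might look like today, here is a minimal sketch using MaxMind's geoip2 Python library (our original tool did this in Java inside the visualization layer). The database path, flow records, and field names are illustrative, not taken from the original code.

```python
# Minimal sketch: enrich flow endpoints with MaxMind ASN data and roll
# device-level traffic up to Autonomous Systems. Assumes a local copy of
# the GeoLite2-ASN database; records and paths are illustrative.
from collections import Counter

import geoip2.database
import geoip2.errors

flows = [
    {"src_ip": "198.51.100.7", "dst_ip": "203.0.113.9", "bytes": 1200},
    {"src_ip": "198.51.100.7", "dst_ip": "192.0.2.44",  "bytes": 800},
]

as_bytes = Counter()
with geoip2.database.Reader("GeoLite2-ASN.mmdb") as reader:
    for flow in flows:
        try:
            asn = reader.asn(flow["dst_ip"])
            key = (asn.autonomous_system_number, asn.autonomous_system_organization)
        except geoip2.errors.AddressNotFoundError:
            key = (None, "unknown")
        as_bytes[key] += flow["bytes"]  # aggregate traffic volume up to the AS level

for (number, org), total in as_bytes.most_common():
    print(f"AS{number} ({org}): {total} bytes")
```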
The team was able to pick out emergent behaviors, qualify normal network behaviors, and share actionable intel with local and federal agencies. The approach logically compressed network flow data roughly 1000x, depending on the aggregation time window selected.
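To make that compression concrete, here is a minimal sketch of time-window aggregation over raw flow records using pandas. The column names and the 5-minute window are illustrative; the original pipeline did this in C++ with OpenPCE, not pandas.

```python
# Minimal sketch: collapse raw flow records into per-device aggregates over
# a fixed time window. Column names and the 5-minute window are illustrative.
import pandas as pd

flows = pd.DataFrame(
    {
        "ts":       pd.to_datetime(["2010-03-01 10:00:05", "2010-03-01 10:01:10",
                                    "2010-03-01 10:02:30", "2010-03-01 10:06:00"]),
        "src_ip":   ["10.0.0.5", "10.0.0.5", "10.0.0.5", "10.0.0.9"],
        "dst_ip":   ["198.51.100.7", "203.0.113.9", "198.51.100.7", "198.51.100.7"],
        "dst_port": [80, 443, 80, 22],
        "bytes":    [1200, 640, 300, 90],
    }
)

# One row per (source device, 5-minute window): many raw flows collapse into a
# handful of aggregate behavior features, which is where the logical
# compression comes from on real traffic volumes.
aggregates = (
    flows.groupby(["src_ip", pd.Grouper(key="ts", freq="5min")])
         .agg(flow_count=("dst_ip", "size"),
              distinct_dsts=("dst_ip", "nunique"),
              distinct_ports=("dst_port", "nunique"),
              total_bytes=("bytes", "sum"))
         .reset_index()
)
print(aggregates)
```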
Our work gave me opportunities to speak about the technique at NATO's CCDCOE, to lecture at MIT, an invitation to Obama's National Cyber Leap Year (where I was part of the nature-inspired defense team), presentations at predictive analytics symposiums, talks in DC to the Dept. of Homeland Security and other agencies, and invitations to SRI to build out startup knowledge on value propositions and elevator pitches.
The overall model and architecture were established in 2006, yet by layering the processing of data that feeds the ML models, the approach offers a deep level of eXplainable AI (XAI) perspective when analysts are trying to make sense of higher-level outcomes. To be clear, much of the initial work focused on establishing primitives at Layer 3.
That said, I dove into modeling the data, published my findings (Deriving Behavior Primitives from Aggregate Network Features using Support Vector Machines), and presented them to NATO's CCDCOE in Tallinn, Estonia. I was able to derive behavior primitives with support vector machines (SVMs) using subsets of the derived, rich feature space.
One common theme in this specific ABA approach is that when you combine high- and low-fidelity sensor telemetry (see my blog on sensor bias here), e.g. network flow/process behaviors with IDS events, you can "burn the haystack" to get the needle, depending on your approach to data transformation, aggregation, and fusion.
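As a hedged illustration of that fusion idea (not the original code), low-fidelity flow aggregates can be joined with high-fidelity IDS alerts so analysts only chase the parts of the haystack that the rule-based telemetry corroborates. The field names, detector signature, and scoring heuristic below are hypothetical.

```python
# Minimal sketch of telemetry fusion: join per-device flow aggregates
# (low fidelity, broad coverage) with IDS alert counts (high fidelity,
# narrow coverage) and rank what remains of the "haystack".
aggregates = [
    {"src_ip": "10.0.0.5", "distinct_dsts": 950, "total_bytes": 1_200_000},
    {"src_ip": "10.0.0.9", "distinct_dsts": 3,   "total_bytes": 4_000},
]
ids_alerts = [
    {"src_ip": "10.0.0.5", "signature": "hypothetical scan signature", "priority": 2},
]

alerts_by_ip = {}
for alert in ids_alerts:
    alerts_by_ip.setdefault(alert["src_ip"], []).append(alert)

fused = []
for agg in aggregates:
    alerts = alerts_by_ip.get(agg["src_ip"], [])
    # Simple heuristic: aggregate behavior that is both anomalous (high fan-out)
    # and corroborated by rule-based telemetry floats to the top.
    score = agg["distinct_dsts"] * (1 + len(alerts))
    fused.append({**agg, "alert_count": len(alerts), "score": score})

for row in sorted(fused, key=lambda r: r["score"], reverse=True):
    print(row)
```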
Establishing a Data Science Development Lifecycle Supporting ABA
Starting in 2006, my team and I needed to establish a data science lifecycle to facilitate the development of a new type of cyber defense technology: aggregate behavior analysis (ABA). This lifecycle hinged on the unique transformation applied to network flow, and later process, data. It included establishing ground-truth data sets, a data aggregation/transformation step, data cleansing/normalization, feature selection, machine learning classification techniques (e.g. SVM) to identify abstract primitives, and then model evaluation.
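A minimal sketch of that lifecycle, assuming aggregate features have already been computed and a ground-truth label attached to each row. The original work used R and custom C++, so the scikit-learn pipeline and the synthetic data below are purely illustrative.

```python
# Minimal sketch of the ABA lifecycle on labeled aggregate features:
# normalization -> feature selection -> SVM classification -> evaluation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))      # 12 aggregate features per device/window (synthetic)
y = rng.integers(0, 3, size=200)    # behavior primitive labels from ground truth (synthetic)

pipeline = Pipeline([
    ("scale", StandardScaler()),                # cleansing/normalization
    ("select", SelectKBest(f_classif, k=6)),    # feature selection on the rich feature space
    ("svm", SVC(kernel="rbf", C=1.0)),          # classify abstract behavior primitives
])

# Model evaluation step: cross-validated accuracy against the ground-truth set.
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"mean accuracy: {scores.mean():.2f} (+/- {scores.std():.2f})")
```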
Summary of How We Evolved Aggregate Behavioral Analysis (ABA)
It was in the summer of 2006, while on vacation with my family, that I received a phone call from a very good friend at work, Jane Goldsmith, telling me I had been awarded an SBIR grant through DHS. I was stunned and amazed, and my mind locked in on the challenge. The thesis was: how do we gather behaviors not only from rule-driven data (e.g. firewalls, IPS/IDS) but also from the underlying foundation of network communications, network flow data (e.g. NetFlow v9 at the time), so that we can uncover "what we don't know we don't know"? And how do we fuse that derived data with rule-based data to "burn the haystack and get closer to the needle"?
While writing the proposal, I assembled a team that year based on public and private partnerships comprising universities (University of Connecticut, Dalhousie University), a risk-based think tank (Delta Risk), and a consultant from RedJack. Early on we held workshops to focus the concept on finding botnets. After we gathered data and performed some data-science-based investigation, we established a proof of concept. I put together a value proposition and later a business plan. I spent a summer writing a patent and got it approved, and Sonalysts, Inc. has built a multi-million dollar business on it that is still evolving and alive today.
Cloud Development and the Use of Ontology
I am now working with big, fast data systems at Secureworks, developing in the cloud. As an architect, I created a platform with our first integration of MITRE ATT&CK data, used to annotate security events so they are better understood by our clients, a kind of ontology.
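A minimal sketch of the annotation idea, with a hand-rolled mapping from detector names to MITRE ATT&CK techniques. The event schema, detector names, and mapping source are illustrative, not the production platform.

```python
# Minimal sketch: annotate security events with MITRE ATT&CK technique IDs so
# clients see a shared vocabulary instead of raw detector names.
ATTACK_MAP = {
    "ssh-brute-force":   {"technique_id": "T1110", "technique": "Brute Force"},
    "port-scan":         {"technique_id": "T1046", "technique": "Network Service Discovery"},
    "powershell-launch": {"technique_id": "T1059", "technique": "Command and Scripting Interpreter"},
}

def annotate(event: dict) -> dict:
    """Attach an ATT&CK annotation to a raw security event, if one is known."""
    annotation = ATTACK_MAP.get(event.get("detector"))
    return {**event, "attack": annotation} if annotation else event

event = {"detector": "port-scan", "src_ip": "10.0.0.5", "timestamp": "2019-06-01T12:00:00Z"}
print(annotate(event))
```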
Embedded and Message-Based Systems
In the early 1990s, my first software-related job, after being an environmental chemist and studying software engineering, was with a startup working on a next-generation intelligent mail sorting platform. I was their lead architect. It was an incredible experience with an awesome team.
I started creating real-time cyber-physical systems (CPS) back in the early 1990s, first with mail sorting equipment and then in automating manufacturing with cloth-cutting systems at Gerber Garment Technology. (A cyber-physical system is a computer system in which a mechanism is controlled or monitored by computer-based algorithms.) Around that time I also moved into distributed web-based applications, rolling out Fidelity's first online web-based trading systems.
Open Source and a Simple Streaming Framework
Before the onrush of cyber security, I delved into open-source network security systems in the late 1990s, establishing a small, focused group of researchers at Sonalysts. After Sonalysts, in the fall of 2014 (I broke my ankle and had some downtime before starting my next journey), I started putting together ideas for a book focused on the concept of behavior attribution analytics; I still have a way to go.
In the mid-90s I started working on a simple event stream processing framework called the Open Pervasive Computing Environment (OpenPCE), written in C++. The goal was to create a streaming capability that would support transformation and custom statistical computations. I open-sourced it in the early 2000s (https://github.com/mccuskero/openpce, https://sourceforge.net/projects/open-pce/). The framework allowed for the creation of stream processors that fed off of sockets; keep in mind, this type of processing is now done by frameworks like Kafka and Flink. The goal was to provide a consistent framework for computationally acquiring statistics and behaviors not only from server-based applications, but also from embedded systems. The capability would be leveraged later as I pursued more advanced and complex solutions for the Department of Homeland Security (DHS). Stream processing capabilities are a cornerstone enabling technology in developing semantic applications driven by unique data transformations.
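The sketch below captures the spirit of an OpenPCE-style stream processor in a few lines of Python: a processor fed off a socket that maintains running statistics over the events it sees. It is purely illustrative; the actual framework is C++ and considerably richer, and the host, port, and event format here are hypothetical.

```python
# Minimal sketch of a socket-fed stream processor: read newline-delimited
# numeric events from a socket and maintain running statistics (count, mean)
# as they arrive.
import socket

HOST, PORT = "127.0.0.1", 9099   # hypothetical event source

def run_processor() -> None:
    count, mean = 0, 0.0
    with socket.create_connection((HOST, PORT)) as sock, sock.makefile("r") as stream:
        for line in stream:                     # one event per line
            try:
                value = float(line.strip())
            except ValueError:
                continue                        # skip malformed events
            count += 1
            mean += (value - mean) / count      # incremental running mean
            if count % 100 == 0:
                print(f"events={count} running_mean={mean:.3f}")

if __name__ == "__main__":
    run_processor()
```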
Summary and Future Work
Knowledge-driven event streaming platforms leverage multiple data sources, technology stacks, development strategies, and capabilities. Any way you look at the problem, data is the driving force; data drives the solution.
I have been developing such systems since the early 1990s and have seen the industries, technologies, and markets slowly evolve to balance needs with capabilities. I became a thought leader and tech evangelist in the creation of ontology-driven behavior analysis capabilities and their applications in cyber security.
Reach out to me sometime and I will tell you how I developed these techniques from core concepts that evolved out of omnifont optical character recognition work in the early 90s.
Data science is an evolving capability that escapes any single, complete definition. Over the years, it has grown from gathering and cleansing data sets to encompass data analysis, predictive analytics, data mining, machine learning, and business intelligence.