ASO Board Study Terms


Statistical Models (Machine Learning)

Complex algorithms that receive data and output information about the data. Each algorithm has one or more HYPERPARAMETERS (mathematical values that determine how the algorithm interacts with the input data). SUPERVISED LEARNING: the algorithm receives data with expected output (labels) and determines how to predict correct labels for new data. Classification: given attributes, outputs a label. Regression: given data, estimates an underlying function that predicts a numeric value for new data, such as the probability of a particular label. UNSUPERVISED LEARNING: the algorithm receives data with no labels and attempts to determine relationships. Clustering: finding groups of like items. Association: finding relationships between items or underlying correlation, as in principal component analysis and other forms of dimensionality reduction. SEMI-SUPERVISED LEARNING: uses a mix of labeled and unlabeled data. REINFORCEMENT LEARNING: the algorithm attempts to maximize a value (reward) through interactions with the environment or by guessing labels, often using neural networks.

Limitations and Implications of Analysis

Do I trust my data? - Is it complete? Am I reading my data correctly? - focuses on accuracy and understanding the fields in your data and ensuring you are interpreting them correctly. Assumptions? - avoid assumptions - verify everything, trust nothing.

Data Presentation

Focus on key aspects: READABILITY - Instead of "Subnet 27", display a descriptive name like "SharePoint Servers". SCALE - Make sure pertinent portions of the data are highlighted. MINIMALISM - Represent only the portions of the data that mean something to your analysis.

Analytic Support Officer

Has a fundamental understanding of computer and data science and can apply basic concepts to solve complex problems involving small to medium size data sets. Capable of applying analytics to problems, identifying limitations, and requesting external support.

Network Analyst

Has basic knowledge of network protocols, theory of operations, header structure, and forensic value. Can install a network sensor and verify its operational status.

Big Data Platforms (BDP)

Gabriel Nimbus (Army), Caspian Pigeon (Air Force), Scarif (USCYBERCOM), ACROPOLIS (DISA). Data can be sent to a BDP via enterprise scanning and endpoint tools such as ACAS, Tychon, or AESS, or shipped via hard drives. It is then aggregated, filtered, and visualized. There is also a chat application that connects all of the GN community for unclassified collaboration. Threat Hub is a repository of observable threat indicators and vulnerabilities based on the MITRE ATT&CK Matrix.

MITRE's ATT&CK

Globally accessible knowledge base of adversary tactics and techniques based on real-world observation.

DATA VOLUME vs DATA VELOCITY

High volumes of data require large amounts of storage to retain. Network traffic with high velocity requires more CPU in order to ingest without dropping packets.

Regression Types

LINEAR - attempts to model the relationship between two variables by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. BAYESIAN LINEAR - used in machine learning to calculate regression coefficient values. LOGISTIC - used to compute the probability of mutually exclusive occurrences such as pass/fail, true/false, 0/1, and so forth. QUANTILE - employed when the linear regression requirements are not met or when the data contains outliers. EXPONENTIAL - model that explains processes that experience growth at a doubling rate.

Local vs Centralized vs Distributed Version Control

LOCAL - used to manage and track versions of a file or groups of files in a local directory. CENTRALIZED - hosts the repository on a central server that the development team accesses remotely. Users download it and make changes. DISTRIBUTED - each machine holds the entire repository and version history, while a centralized server maintains the authoritative copy of the repository.

Structured Query Language (SQL)

Language for interrogating relational databases. TABLES - basic building blocks of relational databases. Each table has a primary key, and each table may have only one primary key. Common SQL commands: SELECT, UPDATE, DELETE, INSERT, DROP (deletes a table completely).
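The commands listed above can be tried with Python's built-in sqlite3 module. This is a minimal sketch: the "hosts" table, its columns, and the IP addresses are hypothetical, and CREATE TABLE is used only to set up something to query.

```python
import sqlite3

# In-memory database; the "hosts" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE hosts (ip TEXT PRIMARY KEY, role TEXT)")

# INSERT adds rows.
cur.executemany(
    "INSERT INTO hosts (ip, role) VALUES (?, ?)",
    [("10.0.0.5", "web"), ("10.0.0.6", "db"), ("10.0.0.7", "web")],
)

# UPDATE changes existing rows.
cur.execute("UPDATE hosts SET role = 'mail' WHERE ip = '10.0.0.7'")

# DELETE removes the rows that match a condition.
cur.execute("DELETE FROM hosts WHERE role = 'db'")

# SELECT reads rows back.
rows = cur.execute("SELECT ip, role FROM hosts ORDER BY ip").fetchall()

# DROP removes the table itself, not just its rows.
cur.execute("DROP TABLE hosts")
tables = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
conn.close()
```

After the run, rows holds the two remaining hosts and tables is empty, which shows the difference between DELETE (removes rows) and DROP (removes the table).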

CCIR Decomposition

Once CCIRs have been decomposed into indicators, an ASO must determine what evidence is needed to answer each indicator.

Indicators for PIRs

Once the team has narrowed down the CCIRs, we determine the indicators which support those CCIRs (PIRs) by looking at the techniques beneath the tactic in the MITRE ATT&CK Matrix. NOTE: In the absence of credible intelligence with attribution to a particular threat entity, we select techniques that are both relatively common across threat groups and apply to the particular environment in which we are operating.

Area of Operations (AO)

Operational area defined by a commander for land and maritime forces that should be large enough to accomplish their missions and protect their forces (JP 3-0)

PCAP vs LOG DATA

PCAP requires significantly more storage space than logs collected from the network traffic, such as Zeek logs. PCAP can give additional details about traffic. Common data formats include JSON, CSV, and TSV. CSV and TSV are useful for easily storing and filtering tables of data. JSON is used widely by a variety of platforms.

Analytic Planning

PLAN, PREPARE, EXECUTE, ASSESS. PLAN - Mission Analysis, Initial Recon, COA Development (Analytic Plan/ASOM). PREPARE - Assess & address capability gaps, Build Data Pipeline, Rehearse. EXECUTE - Apply analysis & employ analytics, conceptualize results, Report results and recommend actions. ASSESS - Analyze & Refine Analytic Plan.

Operational Variables (PMESII)

POLITICAL: Considerations of local, regional, and international government politics. Unofficial politics such as gangs can also be considered. MILITARY: ROE, exclusion zones (no-fly, maritime exclusions, etc.). Man-made obstacles do not fit into this group. ECONOMIC: Considering the adversary's financial systems, inflation rates, and things that factor into a nation's GDP. SOCIAL: Considering past wars, territorial disputes, civil or ethnic strife. INFORMATION: Considerations for how the general populace receives information, as well as health/communicable disease. INFRASTRUCTURE: Potable water, transportation means/systems, power and communication grids.

Basic Statistics Definitions

PROBABILITY - numerical description of how likely an event is to occur or how likely it is that a proposition is true. STATISTICS - discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. POPULATION - set of all the things of interest. SAMPLE - subset of the population being measured. STATISTIC - single measure of some attribute of a sample. PARAMETER - single measure of some attribute of the entire population. COUNT - how many items in the sample meet a certain criterion. MINIMUM - smallest item. MAXIMUM - largest item.

Evidence

Quantifiable output of an analytic. Each indicator or technique may have multiple pieces of evidence that support it.

RFS' Minimum Requirements

Requestor; Mission; Detailed description of capability gap; Shortcomings of existing solutions; Recommended solutions; Impact if not provided; Due date

RFI's Minimum Requirements

Requestor; Mission; Detailed description of information gap; Impact if not provided; Due date

Programming Control Structures

SEQUENCE: order in which code executes. SELECTION: which code is executed, based on a condition. ITERATION: repeating certain operations. ITERATION TYPES: loops (While, For) and recursion. BASE CASE: the condition that ends a recursion. For example, programmers would provide successively smaller versions of a list to the method until the list is empty.
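A minimal Python sketch of the same summing task written two ways, once with a For loop and once with recursion that stops at a base case (the empty list). The function names are illustrative only.

```python
def total_iterative(values):
    """Sum a list with a For loop (iteration)."""
    total = 0
    for value in values:
        total += value
    return total


def total_recursive(values):
    """Sum a list by recursion on successively smaller lists."""
    if not values:  # base case: the list is empty, so stop recursing
        return 0
    # Recursive step: first item plus the sum of the rest of the list.
    return values[0] + total_recursive(values[1:])
```

Both functions return the same result; the recursive version relies on the base case to end, while the loop ends when the list is exhausted.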

Central Limit Theorem

The distribution of sample means approaches a Normal distribution as the sample size grows, even when the actual population is non-normal.
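A small simulation can illustrate the theorem. This sketch assumes a right-skewed exponential population with mean 1.0, a sample size of 50, and 2000 repeated samples; all three are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    # Draw n values from an exponential population (mean 1.0),
    # which is strongly right-skewed, and return their mean.
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))


# Repeat the sampling many times and collect the sample means.
means = [sample_mean(50) for _ in range(2000)]
center = statistics.fmean(means)  # should sit close to the population mean
spread = statistics.stdev(means)  # should sit close to 1 / sqrt(50)
```

The sample means cluster around the population mean with a spread near the population standard deviation divided by the square root of the sample size, even though the individual values are far from normally distributed.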

Analytic Support Officer

Two core functions: analytic planning and analytic development. Analytic planning can be either deliberate or hasty. Key output is the identification of data coverage, visibility, and capability gaps. Data engineers become responsible for the data gaps that are identified. Created ASOMs are informed by the intelligence planning process. ASOMs can be used to generate task and skill requirements of lower echelons that must be trained to and understood for accomplishing the mission. ASOs need a relevant IPB product that ideally contains actionable and current intelligence for an effective product. ASOs should consistently seek to improve the performance and efficacy of the team.

Set Operations

UNION - union of two sets A and B (A ∪ B). INTERSECTION - intersection of two sets A and B (A ∩ B). DIFFERENCE - elements that are in A and not B (A - B). COMPLEMENT - any element not in A (A').
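Python's built-in set type supports these operations directly. In this sketch the IP addresses are made up, and the complement is taken relative to an assumed universe of all addresses seen.

```python
# Hypothetical sets of IP addresses seen by two sensors.
a = {"10.0.0.1", "10.0.0.2", "10.0.0.3"}
b = {"10.0.0.3", "10.0.0.4"}

# The complement needs a universe; here it is everything seen anywhere.
universe = a | b | {"10.0.0.9"}

union = a | b              # in A or B (or both)
intersection = a & b       # in both A and B
difference = a - b         # in A and not in B
complement = universe - a  # in the universe and not in A
```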

Data Relationships

UNIVARIATE - data involving only one variable per observation. We can calculate the mean, median, mode, and standard deviation to describe the data. BIVARIATE - includes two different measurements or labels for each data point, e.g. height and weight. MULTIVARIATE - includes two or more categories or measurements. For example, if we gathered gender information, we could compare the height distribution (mean, median, standard deviation) for each gender.

Area of Influence (AI)

geographical area wherein a commander is directly capable of influencing operations by maneuver or fire support systems normally under the commander's command or control (JP 3-0)

Data Enrichment

Adding features or context to provide more detail to your data. For example, SIEMs have reference tables with CSV mappings of specific keys and values that allow the analyst to cross-reference data and infer meaning.
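A minimal sketch of enrichment with a CSV reference table, using only the standard library. The subnets, names, and event fields are hypothetical and do not come from any particular SIEM.

```python
import csv
import io

# Hypothetical reference table mapping subnets to descriptive names.
reference_csv = (
    "subnet,name\n"
    "10.1.27.0/24,SharePoint Servers\n"
    "10.1.30.0/24,Workstations\n"
)
lookup = {
    row["subnet"]: row["name"]
    for row in csv.DictReader(io.StringIO(reference_csv))
}

# Hypothetical raw events that carry only the subnet.
events = [
    {"src_subnet": "10.1.27.0/24", "bytes": 5120},
    {"src_subnet": "10.9.9.0/24", "bytes": 64},
]

# Enrich each event with the descriptive name, or "unknown" if unmapped.
enriched = [
    dict(event, subnet_name=lookup.get(event["src_subnet"], "unknown"))
    for event in events
]
```

Each enriched event keeps its original fields and gains a subnet_name field, so an analyst reads "SharePoint Servers" rather than a bare subnet.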

F-Score

Method used to measure the performance of a model by striking a balance between recall and precision.
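The common form of this score is the weighted harmonic mean of precision and recall; with beta = 1 it is usually called F1. The sketch below assumes that standard definition, and the function name is illustrative.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta < 1 weights precision more.
    """
    if precision == 0 and recall == 0:
        return 0.0  # avoid dividing by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Example: perfect recall but only half of the positive calls are right.
score = f_beta(0.5, 1.0)
```

Because the harmonic mean is pulled toward the smaller input, a model cannot reach a high score by maximizing recall alone or precision alone.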

Data Pipeline

plan that outlines how a team will collect, aggregate, store, and provide data to the analyst

SPLUNK Search Processing Language (SPL)

Proprietary search language used in Splunk. Known for the pipe character, which works similarly to the pipeline in PowerShell.

Assumption

provides a supposition about the current situation or future course of events, presumed to be true in the absence of facts. Must be logical and realistic and necessary for planning to continue.

Confidence Intervals

range of values containing the value of a population parameter to a certain degree of surety
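As a rough illustration, the sketch below computes a confidence interval for a population mean from a sample. It assumes a normal approximation (a z value rather than a t value), which is only reasonable for larger samples; the data values are made up.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # z value for a two-sided interval, e.g. about 1.96 for 95%.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err


# Hypothetical measurements.
data = [10, 12, 9, 11, 13, 10, 12, 11]
low, high = mean_confidence_interval(data)
low99, high99 = mean_confidence_interval(data, confidence=0.99)
```

Asking for more surety (99% instead of 95%) widens the interval.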

Specificity

Ratio of true negatives to all actual negatives (the counterpart of recall for the negative class).

Specific Information Requirement (SIR)

specified data and processes needed to answer particular indicators.

Developing Research Questions

The research process starts with developing a research question, a clearly stated objective, and some critical assumptions. A good research question is: clear, focused, concise, complex, and arguable; open ended, not binary; complex enough to require analysis and synthesis to generate at least one hypothesis; flexible enough to allow for failure; answerable by the data and facts available. ASO SPECIFIC: "Constrained by these particular data sets and analysis tools, how can we detect a certain adversary technique?" "Does a particular technique or attack sequence generate anomalies that can be detected mathematically, with an acceptable rate for false positives?"

ACH Steps

1. List hypotheses 2. List available evidence 3a. Build a matrix associating evidence to hypotheses 3b. Combine perspectives 4. Refine the matrix 5. Draw tentative conclusions 6. Determine sensitivity to critical evidence 7. Report conclusions 8. Identify milestones/indicators

Advanced Persistent Threat (APT)

Uses continuous, clandestine, and sophisticated hacking techniques to gain access to a system and remain inside for a prolonged period of time with potentially destructive consequences.

Critical Decision Making

Ability to make decisions without perfect information, knowing when enough information allows acceptable decisions, and the willingness to act on imperfect information.

Data Engineering

Able to tune collection scripts to ingest network logs and move data from one location to another leveraging data parsing to reformat data

Hypothesis Testing

Applying a statistical analysis to an observation to confirm or refute an assumption about the observation.

Area of Interest (AoI)

Area of concern to the commander, including the area of influence, areas adjacent thereto, and extending into enemy territory.

False Positive Paradox

Arises when the general attributes of a population cause a reasonably "accurate" test procedure to produce far more false positives than true positives.
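A worked example makes the paradox concrete. The numbers below are hypothetical: 100,000 events, of which 0.1% are truly malicious, run through a detector that is right 99% of the time on both malicious and benign events.

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Expected true-positive and false-positive counts for a test."""
    positives = population * prevalence   # items that truly have the attribute
    negatives = population - positives    # items that truly do not
    true_positives = positives * sensitivity
    false_positives = negatives * (1 - specificity)
    return true_positives, false_positives


# Hypothetical scenario: rare attribute, 99% sensitivity and specificity.
tp, fp = screening_counts(100_000, 0.001, 0.99, 0.99)
share_true = tp / (tp + fp)  # fraction of alerts that are real
```

Here about 99 alerts are true positives and about 999 are false positives, so roughly nine in ten alerts are false even though the detector is "99% accurate". The low base rate of the attribute drives the result.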

Data Types

CATEGORICAL - represents data characteristics. NUMERICAL - data that is measured or counted. -Discrete: indivisible chunks, like the number of students in a classroom. -Continuous: like time, weight, etc. -Interval: ordered data where the difference between values has meaning. -Ratio: similar to interval data but with a "true zero", such as height, weight, age, etc.

DATA TRANSPORT

CPTs may need to transport data as part of their operation over either a network connection or physical medium. Some commands for transferring over a network connection include: NC (Netcat) SCP (Secure Copy) RSYNC - can sync a directory between two devices.

Time Period SIRs

CPTs may suspect that an important event may have taken place prior to them establishing sensors on the network. They may request an SIR for a specific time period

DATA STORAGE

CPTs must be able to retain an adequate amount of data throughout the mission. It is important to take into account the length of time the data must be retained. CPTs can employ a hot, warm, cold storage strategy. This requires storing older data in a highly compressed way and newer data in an easily usable format.

Git Common Commands

CREATING REPOSITORIES:
  mkdir /home/user/new_project
  cd /home/user/new_project
  git init
CLONING REPOSITORIES:
  git clone <url>
  git clone https://github.com/mwaskom/seaborn
COMMITTING CHANGES:
  git add CONTRIBUTING.md
  git commit -m "added guidance for contributors in CONTRIBUTING.md"
BRANCHING AND MERGING:
  git branch - lists the current branches or creates a new one.
  git checkout - switches to another branch. With the -b option, creates the branch if it does not exist.
  git merge - applies the commits in one branch to the current working branch. Flags any merge conflicts and asks the user to resolve them.
  git fetch - updates the local repo with data from remote branches.
  git pull - downloads changes from a remote branch and merges them into the current working branch.
  git push - propagates local commits to remote branches.

Linear Algebra

Can be useful when working with large amounts of numerical data in a matrix.

Data Visualization Best Practices

Choose the best visuals for the data and its purpose. Ensure your data is easily understandable and viewable. Offer necessary context for your audience in and around your visuals. Keep your visual as simple and straightforward as possible. Educate your audience with your visuals.

DATA RETENTION

Depends on the length and scope of mission. BDPs may be used for larger data storage and retention.

Types of Analytic Techniques

DESCRIPTIVE - describe a set of data. EXPLORATORY - find unknown relationships. INFERENTIAL - use sample data to draw conclusions about a larger population. PREDICTIVE - use data on some objects to predict values on other objects. CAUSAL - what happens to one variable when another changes. MECHANISTIC - understand the exact changes that lead to specific changes in individual objects.

What is the CARVER Matrix?

Developed by the US Special Forces during Vietnam. Stands for: Criticality, Accessibility, Recuperability, Vulnerability, Effect, and Recognizability.

Measures of Variation

VARIANCE - how much the population or sample differs from the mean. Population Variance: measures how much all items differ from the actual population mean. Sample Variance - two versions: 1. the actual variance of a particular sample; 2. one that can be used to estimate the population variance (dividing by n - 1 instead of n). STANDARD DEVIATION - measures the variation in the population/sample data; it is the square root of the variance. STANDARD ERROR - variation in the sample statistic.
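Python's statistics module exposes both variance versions. This sketch uses a small made-up data set and computes the standard error of the mean as the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics

# Hypothetical sample of eight measurements (mean is 5).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(sample)   # divides by n
samp_var = statistics.variance(sample)   # divides by n - 1 (estimator)
std_dev = statistics.stdev(sample)       # square root of samp_var
std_err = std_dev / math.sqrt(len(sample))  # variation of the sample mean
```

The n - 1 version is always slightly larger than the n version, which compensates for a sample tending to understate the spread of the population.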

Correlation Coefficients

Indicators used to measure the strength of the linear relationship between two different variables.
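One widely used coefficient is Pearson's r, which ranges from -1 (perfect negative linear relationship) to +1 (perfect positive). The sketch below assumes that coefficient and computes it by hand; the function name and data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # y doubles x exactly
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # y falls as x rises
```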

Induction vs Deduction vs Abduction

Induction: taking a specific occurrence and making an inference about a universal rule. Deduction: taking a known rule or generalization and applying that to the evidence at hand to make predictions or validate/invalidate that general rule. Abduction: applying your experience to limited available information to make a conclusion.

Named Area of Interest (NAI)

Locations in the network where the team expects to find artifacts

LUCENE

Lucene queries look a lot like online search engine queries. KQL is based on Lucene. Example: dest_ip:8.8.8.8 AND dest_port:443

Measures of Central Tendency

MEAN - arithmetic average. MEDIAN - middle value of the sorted values. MODE - most commonly occurring value in the sample.

Python Modules that help with Data Visuals

Matplotlib, Seaborn, Plotnine, Bokeh, Pygal, Plotly, geoplotlib, Gleam

Creating an ASOM

Needs the following: Indicators relevant to the threat for at least 2 CCIRs. Indicators align with MITRE ATT&CK techniques. Evidence relevant to the threat for at least 5 indicators. Evidence examples. Data and action for 15 pieces of evidence that come from at least 3 different indicators. Data should account for both the local defender's current collection capabilities and gaps.

Data Types and Visualizations

Nominal: pie charts. Ordinal: percentiles, median, mode, interquartile range. Numerical: has the most potential for description and visualization, e.g. boxplots, which visualize the distribution of the data.

All-Source Intelligence Analyst (35F)

Prepares intelligence products to support the command and mission. Performs, confirms, or validates the Cyber intelligence preparation of the environment and the threat analysis. Coordinates with external support as needed.

Zero Day

Previously unknown exploit that is not yet detected by signature-based detection devices.

Data Normalization

Process of changing field names to correspond to a pre-determined data model. Failing to understand the data and the corresponding model will prevent you from drawing sound conclusions.

Collection Management

Process of ensuring the organization possesses the data and tools to answer the commander's information requirements. The CM is responsible for recognizing when those needs are not met and coordinating with higher echelons to remedy the situation.

Request Management

Process of screening, tracking, and advocating for the timely and effective satisfaction of RFIs and RFS.

Analytic

Process or technique that refines data into information

EXPLOIT

Program or a piece of code designed to find and take advantage of a security flaw or vulnerability in an application or computer system.

Documentation and Comments

Programmers use it to tell each other about how their code functions.

Version Control Systems

Provide a number of valuable functions to help individuals or groups of programmers track changes and collaborate on coding projects. KEY FUNCTIONS: 1. Track changes. 2. Commit changes. 3. Changesets - made during a commit. Each creates a code revision and is given a unique identifier, either an incremental version number or a unique hash value. 4. Get updates - used to get the most up-to-date version. 5. Identify and resolve conflicts - provides mechanisms for identifying when changes conflict. 6. View differences in file versions (DIFFING) - compares two different versions of the same file. 7. Branches and merges - the master file remains while different versions branch. Ultimately leads to a merge of all branches.

Sensitivity (Recall)

Ratio of True Positives to actual positives. Answers the question "How many of the actual positive subjects are labeled correctly?" or "How likely is a positive subject to receive the positive label?" Calculation is R = TP / (TP + FN)

Accuracy

Ratio of correct predictions to the whole set. Answers the question "How many predictions made by the model are correct?" Calculation is A = (TP + TN) / (TP + FP + TN + FN)

Precision

Ratio of True Positives to all predicted positives. Answers the question: "How many of the positive predictions are true?" Calculation is P = TP / (TP + FP)
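The ratios in this set all come from the same four counts. The sketch below tallies TP, FP, TN, and FN from two made-up label lists (1 = has the attribute, 0 = does not) and then applies the formulas; the function name is illustrative.

```python
def confusion_counts(actual, predicted):
    """Tally TP, FP, TN, FN from paired actual and predicted labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1  # predicted positive, actually positive
        elif p and not a:
            fp += 1  # predicted positive, actually negative
        elif not p and a:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


# Hypothetical labels for eight subjects.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + fp + tn + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
```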

Output of Analysis (Last Step)

Recommend actions to your team leader.

Network Technician

Responsible for installing and maintaining the connection between the CPT Defense Digital Services and the larger network.

Cyber Operations Planner

Responsible for operational planning and intelligence synchronization in order to support the technical scheme of maneuver for each mission.

Data Visualization (Static vs Interactive)

STATIC VISUALIZATION - charts or maps. INTERACTIVE - lets people drill down. TIME SERIES VISUALIZATION - visuals that track data, or performance, over a period of time, e.g. line, bar, area, or bullet charts or graphs. HISTOGRAM - provides a visual representation of the distribution of a dataset: location, spread, and skewness of the data (LEFT or RIGHT, UNIMODAL, BIMODAL, or MULTIMODAL).

Types of Data Variety

STRUCTURED - have well-defined formats and properties UNSTRUCTURED - Lacks a well-defined data model SEMI-STRUCTURED - blend of structured and unstructured

Kibana Query Language (KQL)

Simple syntax for filtering Elasticsearch data using free text search or field-based search. Able to suggest field names, values, and operators as you type. EXAMPLES:
  http.response.status_code: 400 401 404
  http.response.body.content.text:quick brown fox
  http.response.body.content.text:"quick brown fox"

Graph Theory

Study of the relationships between nodes. Shortest Path Algorithm: can find the shortest path between two points in a graph.
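One simple shortest path algorithm for graphs with unweighted edges is breadth-first search, sketched below; weighted graphs need a different algorithm such as Dijkstra's. The network layout and node names are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search for a fewest-hops path, or None if unreachable."""
    queue = deque([[start]])  # each queue entry is a path from start
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical network expressed as an adjacency list.
graph = {
    "workstation": ["switch"],
    "switch": ["workstation", "firewall", "fileserver"],
    "firewall": ["switch", "internet"],
    "fileserver": ["switch"],
    "internet": ["firewall"],
}
route = shortest_path(graph, "workstation", "internet")
```

Because breadth-first search explores nodes in order of distance from the start, the first path that reaches the goal has the fewest hops.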

Base Rate Fallacy

Tendency for people to ignore general information about a population in favor of individuating information, even when the general information is a more reliable factor for decision making.

True Negative (TN)

Test correctly predicts that the subject does not have the attribute

True Positive (TP)

Test correctly predicts that the subject has the attribute

False Negative (FN)

Test incorrectly predicts that the subject does not have the attribute

False Positive (FP)

Test incorrectly predicts that the subject has the attribute

Important NOTE for Cyber Space Operations Layers

There is potentially an analog in each layer of cyberspace (physical, logical, and persona). EX: There is an AO in the physical layer and in the logical layer.

What is OAKOC?

Used to analyze military aspects of terrain. There are 5: OBSERVATION: Ability to see or be seen by the adversary, either visually or through surveillance devices. AVENUE OF APPROACH: Air or ground route of an attacking force of a given size leading to its objective or to key terrain in its path (JP 2-01.3). KEY TERRAIN: Any locality or area, the seizure or retention of which affords a marked advantage to either combatant. Terrain whose seizure and retention is mandatory for successful mission accomplishment. OBSTACLE: Natural or man-made obstruction designed or employed to disrupt, fix, or block the movement of an opposing force, and to impose additional losses in personnel, time, and equipment. COVER AND CONCEALMENT: Cover is protection from the effects of fire. Concealment is protection from observation or surveillance.

5Ws of Analysis of Competing Hypotheses

WHAT WHO: ME Lead, all crew leads, and senior analysts. WHEN: At the CPT Team Lead's discretion for any complex situation where the proximate cause is not immediate. WHERE: Created space between ongoing assessments and the complex, creative analysis of the ACH process. WHY: Allows us to conduct operations quicker and with more accuracy and to report conclusions fully.

Big Data

Data that is impractical to work with using commodity hardware and application of basic statistics. Hallmarks: Volume - sheer amount of data/information in terms of storage. Velocity - speed at which data is flowing into the ecosystem. Variety - number of different data types in the ecosystem. Veracity - how well does the data reflect reality and how well can it predict future events? Validity - is the data uncorrupted/unaltered? Volatility - how long is data relevant, accurate, and useful?

Host Analyst

has immediate knowledge of enterprise services and the security and configuration of them.

