Lesson 12: Ensuring Network Availability


802.1p

IEEE standard defining a 3-bit (0 to 7) class of service priority field within the 802.1Q frame format. While DiffServ works at Layer 3, IEEE 802.1p can be used at Layer 2 (independently or in conjunction with DiffServ) to classify and prioritize traffic passing over a switch or wireless access point. 802.1p defines a tagging mechanism within the 802.1Q VLAN field (it is also often referred to as 802.1Q/p). The 3-bit priority field is set to a value between 0 and 7. Most vendors map DSCP values to 802.1p ones. For example, 7 and 6 can be reserved for network control (such as routing table updates), 5 and 4 map to expedited forwarding levels for 2-way communications, 3 and 2 map to assured forwarding for streaming multimedia, and 1 and 0 are used for "ordinary" best-effort delivery. (CoS mechanisms)
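
As a rough illustration of the priority field described above, the sketch below pulls the 3-bit priority and the 12-bit VLAN ID out of a 16-bit 802.1Q tag value and looks up the example class mapping from this card. The names `PRIORITY_CLASSES` and `parse_tci` are hypothetical, and real vendor mappings may differ.

```python
# Example mapping taken from the card above; vendors may map values differently.
PRIORITY_CLASSES = {
    7: "network control", 6: "network control",
    5: "expedited forwarding", 4: "expedited forwarding",
    3: "assured forwarding", 2: "assured forwarding",
    1: "best effort", 0: "best effort",
}

def parse_tci(tci):
    """Split a 16-bit 802.1Q tag control value into (priority, vlan_id)."""
    priority = (tci >> 13) & 0x7   # top 3 bits: 802.1p priority
    vlan_id = tci & 0x0FFF         # low 12 bits: VLAN ID
    return priority, vlan_id

priority, vlan_id = parse_tci(0xA064)
print(priority, vlan_id, PRIORITY_CLASSES[priority])  # 5 100 expedited forwarding
```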

Performance/Traffic Logs

Record statistics for compute, storage, and network resources over a defined period.

sshd (Command)

Start the SSH Daemon (server). Parameters such as the host's certificate file, port to listen on, and logging options can be set via switches or in a configuration file.

Quality of Service (QoS)

Systems that differentiate data passing over the network and can reserve bandwidth for particular applications. A system that cannot guarantee a level of available bandwidth is often described as Class of Service (CoS). QoS protocols and appliances are designed to support real-time services. Applications such as voice and video that carry real-time data have different network requirements to the sort of data represented by file transfer. With "ordinary" data, it might be beneficial to transfer a file as quickly as possible, but the sequence in which the packets are delivered and the variable intervals between packets arriving do not materially affect the application. This type of data transfer is described as bursty.

bandwidth (throughput)

The amount of data that can be transferred in a given time period. This is the rated speed of all the interfaces available to the device, measured in Mbps or Gbps. For wired Ethernet links, this will not usually vary, but the _________ of WAN and wireless links can change over time.

bottleneck

A point of poor performance that reduces the productivity of the whole network. A bottleneck may occur because a device is underpowered or faulty. It may also occur because of user or application behavior. To find a bottleneck, you need to identify where and when on the network overutilization or excessive errors occur. If the problem is continual, it is likely to be device-related; if the problem only occurs at certain times, it is more likely to be user- or application-related.

System and Application Logs

A system log records startup events plus subsequent changes to the configuration at an OS level. This will certainly include kernel processes and drivers but could also include core services. By contrast, an application log records data for a single specific service, such as DNS, HTTP, or an RDBMS. Note that a complex application could write to multiple log files, however. For example, the Apache web server logs errors to one file and access attempts to another.

Remote Desktop Protocol (RDP)

Application protocol for operating remote connections to a host using a graphical interface. The protocol sends screen data from the remote host to the client and transfers mouse and keyboard input from the client to the remote host. It uses TCP port 3389. RDP is mainly used for the remote administration of a Windows server or client, but another function is to publish software applications on a server, rather than installing them locally on each client (application virtualization).

Secure Shell (SSH)

Application protocol supporting secure tunneling, remote terminal emulation, and file copy. SSH runs over TCP port 22 and provides terminal emulation for a command-line shell.

Telnet

Application protocol supporting unsecure terminal emulation for remote host management. Telnet is both a protocol and a terminal emulation software tool that transmits shell commands and output between a client and the remote host. In order to support Telnet access, the remote computer must run a service known as the Telnet Daemon, which listens on TCP port 23 by default. The Telnet interface can be password protected, but the password and other communications are not encrypted and therefore could be vulnerable to packet sniffing and replay. Telnet should no longer be used, so make sure the service is uninstalled or disabled, and block access to port 23.

Simple Network Management Protocol (SNMP)

Application protocol used for monitoring and managing network devices. It works over UDP ports 161 (queries to agents) and 162 (traps sent to the monitor) by default.

Audit Logs

Record the use of authentication and authorization privileges. An audit log will generally record success/fail type events. An audit log might also be described as an access log or security log. Audit logging might be performed at an OS level and at a per-application level.

SSH Client Authentication

- Username/password-The client submits credentials that are verified by the SSH server either against a local user database or using a network authentication server.
- Public key authentication-Each remote user's public key is added to a list of keys authorized for each local account on the SSH server.
- Kerberos-The client submits the Kerberos credentials (a Ticket Granting Ticket) obtained when the user logged onto the workstation to the server using the Generic Security Services Application Program Interface (GSSAPI). The SSH server contacts the Ticket Granting Service (in a Windows environment, this will be a domain controller) to validate the credential.

iPerf

Utility used to measure the bandwidth achievable over a network link.

Jitter

Variation in the time it takes for a signal to reach the recipient; that is, a variation in the delay. Jitter manifests itself as an inconsistent rate of packet delivery. If packet loss or delay is excessive, then noticeable audio or video problems (artifacts) are experienced by users. Like latency, jitter is measured in milliseconds, using an algorithm to calculate the value from a sample of transit times. Latency and jitter are not significant problems when data transfer is bursty. You can test the latency of a link using tools such as ping, pathping, and mtr. You can also use mtr to calculate jitter. When assessing latency, you need to consider the Round Trip Time (RTT). VoIP is generally expected to require an RTT of less than 300 ms. Jitter should be 30 ms or less. The link should also not exhibit more than 1 percent packet loss.
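
The card does not name the algorithm used to derive jitter from a sample of transit times, so the following is only a minimal sketch under one simple assumption: jitter is taken as the average absolute change between consecutive transit times. The function names and sample values are hypothetical; the thresholds are the VoIP figures quoted above.

```python
from statistics import mean

def mean_jitter_ms(transit_times_ms):
    """Average absolute change between consecutive transit times (one simple estimate)."""
    if len(transit_times_ms) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(transit_times_ms, transit_times_ms[1:])]
    return mean(diffs)

def voip_ready(rtt_ms, jitter_ms, loss_percent):
    """Apply the thresholds from the card: RTT < 300 ms, jitter <= 30 ms, loss <= 1%."""
    return rtt_ms < 300 and jitter_ms <= 30 and loss_percent <= 1

samples = [40, 42, 39, 55, 41]        # made-up transit times in milliseconds
jitter = mean_jitter_ms(samples)
print(jitter)                          # 8.75
print(voip_ready(150, jitter, 0.5))    # True
```

Tools such as mtr may use a different estimator, so values from this sketch are not directly comparable with their output.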

Syslog severity levels

0. Emergency - The system is unusable (kernel panic).
1. Alert - A fault requiring immediate remediation has occurred.
2. Critical - A fault that will require immediate remediation is likely to develop.
3. Error - A nonurgent fault has developed.
4. Warning - A nonurgent fault is likely to develop.
5. Notice - A state that could potentially lead to an error condition has developed.
6. Informational - A normal but reportable event has occurred.
7. Debug - Verbose status conditions used during development and testing.
If the logging level for remote forwarding is set to 4, events that are level 5, 6, or 7 are not forwarded. An automated event management system can be configured to generate some sort of alert when certain event types of a given severity are encountered. Alerts can also be generated by setting thresholds for performance counters. Examples include packet loss, link bandwidth drops, number of sessions established, delay/jitter in real-time applications, and so on. Most network monitors also support heartbeat tests so that you can receive an alert if a device or server stops responding to probes. Setting alerts is a matter of balance. On the one hand, you do not want performance to deteriorate to the point that it affects user activity; on the other hand, you do not want to be overwhelmed by alerts.
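
The forwarding rule above (a level of 4 forwards levels 0 to 4 and drops 5 to 7) can be sketched as a simple comparison. `should_forward` is a hypothetical helper, not part of any syslog implementation.

```python
SEVERITIES = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]

def should_forward(event_level, forwarding_level):
    """Lower numbers are more severe, so forward anything at or below the threshold."""
    return event_level <= forwarding_level

forwarded = [SEVERITIES[lvl] for lvl in range(8) if should_forward(lvl, 4)]
print(forwarded)  # ['Emergency', 'Alert', 'Critical', 'Error', 'Warning']
```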

TROUBLESHOOTING INTERFACE ERRORS

1. Cyclic Redundancy Check Errors
A cyclic redundancy check (CRC) is calculated by an interface when it sends a frame. The CRC is calculated from the frame contents to derive a 32-bit value. This is appended to the frame as the frame check sequence (FCS) in the trailer. The receiving interface uses the same calculation. If it derives a different value, the frame is rejected. The number of CRC errors can be monitored per interface. CRC errors are usually caused by interference. This interference might be due to poor quality cable or termination, attenuation, mismatches between optical transceivers or cable types, or due to some external factor.
2. Encapsulation Errors
Encapsulation is the frame format expected on the interface. Encapsulation errors will prevent transmission and reception. If you check the interface status, the physical link will be listed as up, but the line protocol will be listed as down. This type of error can arise in several circumstances:
- Ethernet frame type-Ethernet can use a variety of frame types. The most common is Ethernet II, but if a host is configured to use a different type, such as SNAP, then errors will be reported on the link.
- Ethernet trunks-When a trunk link is established between two switches, it will very commonly use the Ethernet 802.1Q frame format. 802.1Q specifies an extra frame header to carry a VLAN ID and type of service data. If one switch interface is using 802.1Q but the other is not, this may be reported as an encapsulation error.
- WAN framing-Router interfaces to provider networks can use a variety of frame formats. Often these are simple serial protocols, such as High-level Data Link Control (HDLC) or Point-to-Point Protocol (PPP). Alternatively, the interface may use encapsulated Ethernet over Asynchronous Transfer Mode (ATM) or Virtual Private LAN Service (VPLS) or an older protocol, such as Frame Relay. The interface on the Customer Edge (CE) router must be configured for the same framing type as the Provider Edge (PE) router.
3. Runt Frame Errors
A runt is a frame that is smaller than the minimum size (64 bytes for Ethernet). A runt frame is usually caused by a collision. In a switched environment, collisions should only be experienced on an interface connected to a legacy hub device or where there is a duplex mismatch in the interface configuration (or possibly on a misconfigured link to a virtualization platform). If runts are generated in other conditions, suspect a driver issue on the transmitting host.
4. Giant Frame Errors
A giant is a frame that is larger than the maximum permissible size (1518 bytes for Ethernet II). There are two likely causes of giant frames:
- Ethernet trunks-As above, if one switch interface is configured for 802.1Q framing, but the other is not, the frames will appear too large to the receiver, as 802.1Q adds 4 bytes to the header, making the maximum frame size 1522 bytes. An Ethernet frame that is slightly larger (up to 1600 bytes) is often referred to as a baby giant.
- Jumbo frames-A host might be configured to use jumbo frames, but the switch interface is not configured to receive them. This type of issue often occurs when configuring storage area networks (SANs) or links between SANs and data networks.
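
The CRC comparison and the frame size limits above can be sketched as follows. This is a simplified model, not how an interface implements it: `zlib.crc32` is used as a stand-in for the 32-bit frame check calculation (bit ordering on the wire is ignored), and `fcs_matches` and `classify_frame_size` are hypothetical names.

```python
import zlib

MIN_FRAME = 64            # bytes, minimum Ethernet frame
MAX_FRAME = 1518          # bytes, maximum Ethernet II frame
MAX_TAGGED_FRAME = 1522   # bytes, maximum with a 4-byte 802.1Q tag
BABY_GIANT_LIMIT = 1600   # bytes, upper bound for a "baby giant"

def fcs_matches(frame_contents: bytes, received_fcs: int) -> bool:
    """Receiver repeats the sender's calculation; a mismatch means the frame is rejected."""
    return zlib.crc32(frame_contents) == received_fcs

def classify_frame_size(length: int, tagged: bool = False) -> str:
    """Label a frame length as runt, ok, baby giant, or giant."""
    limit = MAX_TAGGED_FRAME if tagged else MAX_FRAME
    if length < MIN_FRAME:
        return "runt"
    if length <= limit:
        return "ok"
    if length <= BABY_GIANT_LIMIT:
        return "baby giant"
    return "giant"

sent = b"example frame contents"
fcs = zlib.crc32(sent)
print(fcs_matches(sent, fcs))                   # True
print(classify_frame_size(1522))                # baby giant (receiver not expecting a tag)
print(classify_frame_size(1522, tagged=True))   # ok
```

The last two lines show the trunk mismatch case: the same 1522-byte frame is acceptable on an 802.1Q interface but oversized on an untagged one.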

INTERFACE MONITORING METRICS

1. Link state-Measures whether an interface is working (up) or not (down). You would configure an alert if an interface goes down so that it can be investigated immediately. You may also want to track the uptime or downtime percentage so that you can assess a link's reliability over time.
2. Resets-The number of times an interface has restarted over the counter period. Interfaces may be reset manually or could restart automatically if traffic volume is very high, or a large number of errors are experienced. Anything but occasional resets should be closely monitored and investigated. An interface that continually resets is described as flapping.
3. Speed-This is the rated speed of the interface, measured in Mbps or Gbps. For wired Ethernet links this will not usually vary, but the bandwidth of WAN and wireless links may change over time. For Ethernet links, the interface speed should be the same on both the host and switch ports.
4. Duplex-Most Ethernet interfaces operate in full duplex mode. If an interface is operating in half duplex mode, there is likely to be some sort of problem, unless you are supporting a legacy device.
5. Utilization-The data transferred over a period. This can either be measured as the amount of data traffic both sent and received (measured in bits or bytes per second or a multiple thereof) or calculated as a percentage of the available bandwidth.
6. Per-protocol utilization-Packet or byte counts for a specific protocol. It is often useful to monitor both packet counts and bandwidth consumption. High packet counts will incur processing load on the CPU and system memory resources of the appliance, even if the size of each packet is quite small.
7. Error rate-The number of packets per second that cause errors. Errors may occur as a result of interference or poor link quality causing data corruption in frames. In general terms, error rates should be under 1 percent; high error rates may indicate a driver problem, if a network media problem can be ruled out.
8. Discards/drops-An interface may discard incoming and/or outgoing frames for several reasons, including checksum errors, mismatched MTUs, packets that are too small (runts) or too large (giants), high load, or permissions (the sender is not on the interface's access control list (ACL) or there is some sort of VLAN configuration problem, for instance). Each interface is likely to class the type of discard or drop separately to assist with troubleshooting the precise cause.
9. Retransmissions-Errors and discards/drops mean that frames of data are lost during transmission between two devices. As a result, the communication will be incomplete, and the data will, therefore, have to be retransmitted to ensure application data integrity. If you observe high levels of retransmissions (as a percentage of overall traffic), you must analyze and troubleshoot the specific cause of the underlying packet loss, which could involve multiple aspects of network configuration and connectivity.
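
As a minimal sketch of the utilization and error rate metrics above, the functions below turn two readings of an interface byte counter into a percentage of available bandwidth and express errors as a percentage of packets. The function names and counter values are hypothetical, and counter wrap-around is not handled.

```python
def utilization_percent(bytes_start, bytes_end, interval_s, link_bps):
    """Share of the link's rated speed used between two byte-counter readings."""
    bits = (bytes_end - bytes_start) * 8
    return bits * 100 / (interval_s * link_bps)

def error_percent(error_packets, total_packets):
    """Errored packets as a percentage of all packets seen."""
    if total_packets == 0:
        return 0.0
    return error_packets * 100 / total_packets

# 750,000,000 bytes in 60 seconds on a 1 Gbps link
print(utilization_percent(0, 750_000_000, 60, 1_000_000_000))  # 10.0
print(error_percent(5, 1000) < 1)                               # True: under the 1 percent guideline
```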

TRAFFIC ANALYSIS TOOLS

1. Throughput Testers
One fairly simple way to measure network throughput is to transfer a large file between two appropriate hosts. Appropriate in this sense means an appropriate subnet and representative of servers and workstations that you want to measure. It is also important to choose a representative time. There is not much point in measuring the throughput when the network is carrying no other traffic. To determine your network throughput using this method, simply divide the file size by the amount of time taken to copy the file. For example, if you transfer a 1 GB file in half an hour, the throughput can be calculated as follows: 1 gigabyte is 1024 megabytes (1,073,741,824 bytes or 8,589,934,592 bits). 8,589,934,592 bits in 1,800 seconds is 4,772,186 bits per second, or about 4.77 Mbps (4.55 Mbps if a megabit is counted as 1,048,576 bits).
2. Top Talkers/Listeners
Top talkers are interfaces generating the most outgoing traffic (in terms of bandwidth), while top listeners are the interfaces receiving the most incoming traffic. Identifying these hosts and the routes they are using is useful in identifying and eliminating performance bottlenecks. Most network analyzer software comes with filters or built-in reporting to identify top talkers or top listeners.
3. Bandwidth Speed Testers
In addition to testing performance on a local network, you may also want to test Internet links using some type of bandwidth speed tester. There are many Internet tools available for checking performance. The two main classes are:
- Broadband speed checkers-These test how fast the local broadband link to the Internet is. They are mostly designed for SOHO use. The tool will test downlink and uplink speeds, test latency using ping, and can usually compare the results with neighboring properties and other users of the same ISP.
- Website performance checkers-These query a nominated website to work out how quickly pages load. One of the advantages of an online tool is that you can test your site's response times from the perspective of customers in different countries.
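
The file-transfer calculation in item 1 can be reproduced in a few lines. `throughput_bps` is a hypothetical helper; the figures are the 1 GB in half an hour example from the card, shown with both decimal and binary megabits.

```python
def throughput_bps(size_bytes, seconds):
    """Bits transferred divided by elapsed time."""
    return size_bytes * 8 / seconds

bps = throughput_bps(1024 ** 3, 30 * 60)   # 1 GB copied in half an hour
print(round(bps))                  # 4772186 bits per second
print(round(bps / 1_000_000, 2))   # 4.77 (1 Mb = 1,000,000 bits)
print(round(bps / 2 ** 20, 2))     # 4.55 (1 Mb = 1,048,576 bits)
```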

SSH host keys

An SSH server is identified by a public/private key pair, referred to as the host key. A mapping of host names to public keys can be kept manually by each SSH client, or there are various enterprise software products designed for SSH key management.

Traffic shapers

Appliances and/or software that enable administrators to closely monitor network traffic and to manage that network traffic. The primary function of a traffic shaper is to optimize network media throughput to get the most from the available bandwidth.

Network Time Protocol (NTP)

Application protocol, running over UDP port 123, that allows machines to synchronize to the same time clock. Top-level NTP servers (stratum 1) obtain the Coordinated Universal Time (UTC) via a direct physical link to an accurate clock source, such as an atomic clock accessed over the Global Positioning System (GPS). An NTP server that synchronizes its time with a stratum 1 server over a network is operating at stratum 2. Each stratum level represents a step away from the accurate clock source over a network link. These lower stratum servers act as clients of the stratum 1 servers and as servers or time sources to lower stratum NTP servers or client hosts. Most switches and routers can be configured to act as time servers to local client hosts and this function is also typically performed by network directory servers. It is best to configure each of these devices with multiple reference time sources (at least three) and to establish them as peers to allow the NTP algorithm to detect drifting or obviously incorrect time values.

NetFlow

Cisco-developed means of reporting network flow information to a structured database. NetFlow allows better understanding of IP traffic flows as used by different network applications and hosts. Using NetFlow involves deploying three types of components:
1. A NetFlow exporter is configured on network appliances (switches, routers, and firewalls). Each flow is defined on an exporter. A traffic flow is defined by packets that share the same characteristics, such as IP source and destination addresses, source and destination ports, and protocol type. These five pieces of information are referred to as a 5-tuple. A 7-tuple flow adds the input interface and IP type of service data. Each exporter caches data for newly seen flows and sets a timer to determine flow expiration. When a flow expires or becomes inactive, the exporter transmits the data to a collector.
2. A NetFlow collector aggregates flows from multiple exporters. A large network can generate huge volumes of flow traffic and data records, so the collector needs a high bandwidth network link and substantial storage capacity. The exporter and collector must support compatible versions of NetFlow and/or IPFIX. The most widely deployed versions of NetFlow are v5 and v9.
3. A NetFlow analyzer reports and interprets information by querying the collector and can be configured to generate alerts and notifications. In practical terms, the collector and analyzer components are often implemented as a single product.
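
To make the idea of a flow concrete, here is a minimal sketch of grouping packets by a 5-tuple (source and destination address, source and destination port, protocol) and counting packets and bytes per flow. It is not NetFlow itself: the record layout, `FlowKey`, and `aggregate_flows` are hypothetical, and flow timers and export are left out.

```python
from collections import defaultdict, namedtuple

FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port protocol")

def aggregate_flows(packets):
    """Group packet records by 5-tuple and total the packets and bytes in each flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = FlowKey(p["src_ip"], p["dst_ip"], p["src_port"], p["dst_port"], p["protocol"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p["length"]
    return dict(flows)

# Made-up packet records for illustration
packets = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.1.9", "src_port": 50000, "dst_port": 443, "protocol": "TCP", "length": 1500},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.1.9", "src_port": 50000, "dst_port": 443, "protocol": "TCP", "length": 400},
    {"src_ip": "10.0.0.7", "dst_ip": "10.0.1.2", "src_port": 53000, "dst_port": 53, "protocol": "UDP", "length": 80},
]
for key, totals in aggregate_flows(packets).items():
    print(key, totals)
```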

ssh-agent

Configure a service to store the keys used to access multiple hosts. The agent stores the private key for each public key securely and reduces the number of times use of a private key has to be confirmed with a passphrase. This provides a single sign-on (SSO) mechanism for multiple SSH servers. The ssh-add command is used to add a key to the agent.

ssh-keygen (Command)

Create a key pair to use to access servers. The private key must be stored securely on your local computer. The public key must be copied to the server. You can use the ssh-copy-id command to do this, or you can copy the file manually.

CPU and memory

Devices such as switches and routers perform a lot of processing. If ___________ utilization (measured as a percentage) is very high, an upgrade might be required. High ________ utilization can also indicate a problem with network traffic.

environmental sensor

Distinct from performance monitors, an _______ is used to detect factors that could threaten the integrity or availability of an appliance or its function.
- Temperature-High temperature will make it difficult for device and rack cooling systems to dissipate heat effectively. This increases the risk of overheating of components within device chassis and consequent faults.
- Humidity-More water vapor in the air risks condensation forming within a device chassis, leading to corrosion and short circuit faults. Conversely, very low humidity increases risks of static charges building up and damaging components.
- Electrical-Computer systems need a stable power supply, free from outages (blackouts), voltage dips (brownouts), and voltage spikes and surges. Sensors built into power distribution systems and backup battery systems can report deviations from a normal power supply.
- Flooding-There may be natural or person-made flood risks from nearby water courses and reservoirs or risks from leaking plumbing or fire suppression systems. Electrical systems need to be shut down immediately in the presence of any significant amount of water.

Differentiated Services (DiffServ)

Framework that classifies each packet passing through a device. Router policies can then be defined to use the packet classification to prioritize delivery. DiffServ is an IP (Layer 3) service tagging mechanism. It uses the Type of Service field in the IPv4 header (Traffic Class in IPv6). The field is populated with a 6-bit DiffServ Code Point (DSCP) by either the sending host or by the router. Packets with the same DSCP and destination are referred to as Behavior Aggregates and allocated the same Per Hop Behavior (PHB) at each DiffServ-compatible router. DiffServ traffic classes are typically grouped into three types: Best Effort, Assured Forwarding (which is broken down into sub-levels), and Expedited Forwarding (which has the highest priority). (CoS mechanisms)
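
The DSCP occupies the upper 6 bits of the 8-bit Type of Service / Traffic Class byte, so it can be read with a shift. The sketch below also groups code points into the three traffic class types named above; the specific values (46 for Expedited Forwarding, 8x + 2y for Assured Forwarding class x and drop level y) come from the DiffServ RFCs rather than from this card, and the function names are hypothetical.

```python
# Assured Forwarding code points: class 1-4, drop precedence 1-3 (AF11 = 10 ... AF43 = 38)
AF_CODE_POINTS = {8 * cls + 2 * drop for cls in range(1, 5) for drop in range(1, 4)}
EF_CODE_POINT = 46

def dscp_from_tos(tos_byte):
    """Return the 6-bit DSCP held in the upper bits of the ToS / Traffic Class byte."""
    return (tos_byte >> 2) & 0x3F

def traffic_class(dscp):
    """Group a DSCP into the three broad types listed in the card."""
    if dscp == EF_CODE_POINT:
        return "Expedited Forwarding"
    if dscp in AF_CODE_POINTS:
        return "Assured Forwarding"
    if dscp == 0:
        return "Best Effort"
    return "Other"

dscp = dscp_from_tos(0xB8)
print(dscp, traffic_class(dscp))  # 46 Expedited Forwarding
```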

Bandwidth

Generally used to refer to the amount of information that can be transferred through a connection over a given period, measured in bits per second (bps) or some multiple thereof. Bandwidth more properly means the range of frequencies supported by transmission media, measured in Hertz. When monitoring, you need to distinguish between the nominal data link/Ethernet bit rate, the throughput of a link at Layer 3, and the throughput available to an application. The voice frequency range is 4000 Hz. This must be sampled at twice the rate (8000 samples per second) to ensure an accurate representation of the original analog waveform. The sample size is 1 byte (or 8 bits). Therefore, 8 kHz x 8 bits = 64 Kbps. For VoIP, bandwidth requirements for voice calling can vary, but allowing 100 Kbps per call upstream and downstream should be sufficient in most cases. Bandwidth required for video is determined by image resolution (number of pixels), color depth, and the frame rate, measured in frames per second (fps).
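
The voice arithmetic above (sample at twice the 4000 Hz range, 8 bits per sample) can be checked with a short sketch. The helper names are hypothetical, and the per-call allowance is the 100 Kbps planning figure from the card, treated here as 100,000 bps.

```python
def sampled_bitrate_bps(max_frequency_hz, sample_size_bits):
    """Sample at twice the highest frequency, then multiply by bits per sample."""
    sample_rate_hz = 2 * max_frequency_hz
    return sample_rate_hz * sample_size_bits

def calls_supported(link_bps, per_call_bps=100_000):
    """How many calls fit in one direction of a link at the per-call allowance."""
    return link_bps // per_call_bps

print(sampled_bitrate_bps(4000, 8))   # 64000 bits per second (64 Kbps)
print(calls_supported(10_000_000))    # 100 calls on a 10 Mbps link
```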

bandwidth speed tester

Hosted utility used to measure actual speed obtained by an Internet link to a representative server or to measure the response times of websites from different locations on the Internet.

Storage

Keeps configuration information and logs. _________ is measured in MB or GB. If the device runs out of ________ it could cause serious errors. Servers also depend on fast input/output (I/O) to run applications efficiently.

Performance Metrics

Measurement of a value affecting system performance, such as CPU or memory utilization.

interface statistics

Metrics recorded by a host or switch that enable monitoring of link state, resets, speed, duplex setting, utilization, and error rates.

LOG REVIEWS

Monitoring involves viewing traffic, protocols, and events in real time. Network and log reviewing, or analysis, involves later inspection and interpretation of captured data to determine what the data shows was happening on the network during the capture. Monitoring is aligned with incident response; analysis is aligned with investigating the cause of incidents or preventing incidents in the first place. It is important to perform performance analysis and log review continually. Referring to the logs only after a major incident misses the opportunity to identify threats and vulnerabilities or performance problems early and to respond proactively. Not all performance incidents will be revealed by a single event. One of the features of log analysis and reporting software should be to identify trends. A trend is difficult to spot by examining each event in a log file. Instead, you need software to chart the incidence of types of events and show how the number or frequency of those events changes over time. Plotting data as a graph is particularly helpful as it is easier to spot trends, spikes, or troughs in a visualization of events than in the raw data. Most performance monitors can plot metrics in a graph.

Latency

The time it takes for a signal to reach the recipient. A video application can support a latency of about 80 ms, while typical latency on the Internet can reach 1000 ms at peak times. Latency is a particular problem for 2-way applications, such as VoIP (telephone) and online conferencing.

SNMP Agents

A process (software or firmware) running on a switch, router, server, or other SNMP-compatible network device. This agent maintains a database called a Management Information Base (MIB) that holds statistics relating to the activity of the device, such as the number of frames per second handled by a switch. Each parameter stored in a MIB is referred to by a numeric Object Identifier (OID). OIDs are stored within a tree structure. Part of the tree is generic to SNMP, while part can be defined by the device vendor. An agent is configured with the community name of the computers allowed to manage the agent and the IP address or host name of the server running the management system. The community name acts as a rudimentary type of password. An agent can pass information only to management systems configured with the same community name. There are usually two community names: one for read-only access and one for read-write access (or privileged mode).

terminal emulator

Software that enables a standard client computer to appear to a host computer as a dedicated terminal. A terminal emulator is any kind of software that replicates the TTY input/output function. A terminal emulator application might support connections to multiple types of shell.

Performance Baselines

Establishes the resource utilization metrics at a point in time, such as when the system was first installed. This provides a comparison to measure system responsiveness later. For example, if a company is expanding a remote office that is connected to the corporate office with an ISP's basic tier package, the baseline can help determine if there is enough reserve bandwidth to handle the extra user load, or if the basic package needs to be upgraded to support higher bandwidths.

Syslog

An example of a protocol and supporting software that facilitates log collection. It has become a de facto standard for logging events from distributed systems. For example, syslog messages can be generated by Cisco® routers and switches, as well as UNIX or Linux servers and workstations. A syslog collector usually listens on UDP port 514. Syslog provides an open format for event data. A syslog message comprises a PRI code, a header containing a timestamp and host name, and a message part. The PRI code is calculated from the facility and a severity level. The message part contains a tag showing the source process plus content. The format of the content is application dependent. It might use space- or comma-delimited fields or name/value pairs, such as JavaScript Object Notation (JSON) data.
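
The card says the PRI code is calculated from the facility and severity without giving the formula. In the syslog specifications the value is facility x 8 + severity, which the sketch below encodes and decodes; the function names are hypothetical.

```python
def encode_pri(facility, severity):
    """PRI = facility * 8 + severity (severity occupies the low 3 bits)."""
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity

def decode_pri(pri):
    """Return (facility, severity) for a PRI value."""
    return divmod(pri, 8)

print(encode_pri(4, 2))   # 34: facility 4, severity 2 (Critical)
print(decode_pri(165))    # (20, 5): facility 20, severity 5 (Notice)
```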

SNMP Monitor

Management software that provides a location from which you can oversee network activity. The monitor polls agents at regular intervals for information from their MIBs and displays the information for review. It also displays any trap operations as alerts for the network administrator to assess and act upon as necessary. The monitor can retrieve information from a device in two main ways:
- Get-The software queries the agent for a single OID. This command is used by the monitor to perform regular polling (obtaining information from devices at defined intervals).
- Trap-The agent informs the monitor of a notable event (port failure, for instance). The threshold for triggering traps can be set for each value.

VLAN infrastructure

VLAN infrastructure is often used for traffic management on local networks. For example, voice traffic might be allocated to a different VLAN than data traffic.

