How Amazon Security Lake is helping customers simplify security data management for proactive threat analysis

TutoSartup excerpt from this article:
Security Lake centralizes security data from AWS environments, software as a service (SaaS) providers, and on-premises and cloud sources into a purpose-built data lake that is stored in your AWS account… Using Open Cybersecurity Schema Framework (OCSF) support, Security Lake normalizes and combin…

In this post, we explore how Amazon Web Services (AWS) customers can use Amazon Security Lake to efficiently collect, query, and centralize logs on AWS. We also discuss new use cases for Security Lake, such as applying generative AI to Security Lake data for threat hunting and incident response, and we share the latest service enhancements and developments from our growing landscape. Security Lake centralizes security data from AWS environments, software as a service (SaaS) providers, and on-premises and cloud sources into a purpose-built data lake that is stored in your AWS account. Using Open Cybersecurity Schema Framework (OCSF) support, Security Lake normalizes and combines security data from AWS and a broad range of third-party data sources. This helps provide your security team with the ability to investigate and respond to security events and analyze possible threats within your environment, which can facilitate timely responses and help to improve your security across multicloud and hybrid environments.

One year ago, AWS embarked on a mission driven by the growing customer need to revolutionize how security professionals centralize, optimize, normalize, and analyze their security data. As we celebrate the one-year general availability milestone of Amazon Security Lake, we’re excited to reflect on the journey and showcase how customers are using the service, yielding both productivity and cost benefits, while maintaining ownership of their data.

Figure 1 shows how Security Lake works, step by step. For more details, see What is Amazon Security Lake?

Figure 1: How Security Lake works

Customer use cases

In this section, we highlight how some of our customers have found the most value with Security Lake and how you can use Security Lake in your organization.

Simplify the centralization of security data management across hybrid environments to enhance security analytics

Many customers use Security Lake to gather and analyze security data from various sources, including AWS, multicloud, and on-premises systems. By centralizing this data in a single location, organizations can streamline data collection and analysis, help eliminate data silos, and improve cross-environment analysis. This enhanced visibility and efficiency allows security teams to respond more effectively to security events. With Security Lake, customers simplify data gathering and reduce the burden of data retention and extract, transform, and load (ETL) processes with key AWS data sources.

For example, Interpublic Group (IPG), an advertising company, uses Security Lake to gain a comprehensive, organization-wide grasp of their security posture across hybrid environments. Watch this video from re:Inforce 2023 to understand how IPG streamlined their security operations.

Before adopting Security Lake, the IPG team had to work through the challenge of managing diverse log data sources. This involved translating and transforming the data, as well as reconciling complex elements like IP addresses. However, with the implementation of Security Lake, IPG was able to access previously unavailable log sources. This enabled them to consolidate and effectively analyze security-related data. The use of Security Lake has empowered IPG to gain comprehensive insight into their security landscape, resulting in a significant improvement in their overall security posture.

“We can achieve a more complete, organization-wide understanding of our security posture across hybrid environments. We could quickly create a security data lake that centralized security-related data from AWS and third-party sources,” — Troy Wilkinson, Global CISO at Interpublic Group

Streamline incident investigation and reduce mean time to respond

As organizations expand their cloud presence, they need to gather information from diverse sources. Security Lake automates the centralization of security data from AWS environments and third-party logging sources, including firewall logs, storage security, and threat intelligence signals. This removes the need for custom, one-off security consolidation pipelines. The centralized data in Security Lake streamlines security investigations, making security management and investigations simpler. Whether you’re in operations or a security analyst, dealing with multiple applications and scattered data can be tough. Security Lake makes it simpler by centralizing the data, reducing the need for scattered systems. Plus, with data stored in your Amazon Simple Storage Service (Amazon S3) account, Security Lake lets you store a lot of historical data, which helps when looking back for patterns or anomalies indicative of security issues.

For example, SEEK, an Australian online employment marketplace, uses Security Lake to streamline incident investigations and reduce mean time to respond (MTTR). Watch this video from re:Invent 2023 to understand how SEEK improved its incident response. Prior to using Security Lake, SEEK had to rely on a legacy VPC Flow Logs system to identify possible indicators of compromise, which took a long time. With Security Lake, the company was able to more quickly identify a host in one of their accounts that was communicating with a malicious IP. Now, Security Lake has provided SEEK with swift access to comprehensive data across diverse environments, facilitating forensic investigations and enabling them scale their security operations better.

Optimize your log retention strategy

Security Lake simplifies data management for customers who need to store large volumes of security logs to meet compliance requirements. It provides customizable retention settings and automated storage tiering to optimize storage costs and security analytics. It automatically partitions and converts incoming security data into a storage and query-efficient Apache Parquet format. Security Lake uses the Apache Iceberg open table format to enhance query performance for your security analytics. Customers can now choose which logs to keep for compliance reasons, which logs to send to their analytics solutions for further analysis, and which logs to query in place for incident investigation use cases. Security Lake helps customers to retain logs that were previously unfeasible to store and to extend storage beyond their typical retention policy, within their security information and event management (SIEM) system.

Figure 2 shows a section of the Security Lake activation page, which presents users with options to set rollup AWS Regions and storage classes.

Figure 2: Security Lake activation page with options to select a roll-up Region and set storage classes

For example, Carrier, an intelligent climate and energy solutions company, relies on Security Lake to strengthen its security and governance practices. By using Security Lake, Carrier can adhere to compliance with industry standards and regulations, safeguarding its operations and optimizing its log retention.

“Amazon Security Lake simplifies the process of ingesting and analyzing all our security-related log and findings data into a single data lake, strengthening our enterprise-level security and governance practices. This streamlined approach has enhanced our ability to identify and address potential issues within our environment more effectively, enabling us to proactively implement mitigations based on insights derived from the service,” — Justin McDowell, Associate Director for Enterprise Cloud Services at Carrier

Proactive threat and vulnerability detection

Security Lake helps customers identify potential threats and vulnerabilities earlier in the development process by facilitating the correlation of security events across different data sources, allowing security analysts to identify complex attack patterns more effectively. This proactive approach shifts the responsibility for maintaining secure coding and infrastructure to earlier phases, enhancing overall security posture. Security Lake can also optimize security operations by automating safeguard protocols and involving engineering teams in decision-making, validation, and remediation processes. Also, by seamlessly integrating with security orchestration tools, Security Lake can help expedite responses to security incidents and preemptively help mitigate vulnerabilities before exploitation occurs.

For example, SPH Media, a Singapore media company, uses Security Lake to get better visibility into security activity across their organization, enabling proactive identification of potential threats and vulnerabilities.

Security Lake and generative AI for threat hunting and incident response practices

Centralizing security-related data allows organizations to efficiently analyze behaviors from systems and users across their environments, providing deeper insight into potential risks. With Security Lake, customers can centralize their security logs and have them stored in a standardized format using OCSF. OCSF implements a well-documented schema maintained at schema.ocsf.io, which generative AI can use to contextualize the structured data within Security Lake tables. Security personnel can then ask questions of the data using familiar security investigation terminology rather than needing data analytics skills to translate an otherwise simple question into complex SQL (Structured Query Language) queries. Building upon the previous processes, Amazon Bedrock then takes an organization’s natural language incident response playbooks and converts them into a sequence of queries capable of automating what would otherwise be manual investigation tasks.

“Organizations can benefit from using Amazon Security Lake to centralize and manage security data with the OCSF and Apache Parquet format. It simplifies the integration of disparate security log sources and findings, reducing complexity and cost. By applying generative AI to Security Lake data, SecOps can streamline security investigations, facilitate timely responses, and enhance their overall security,” — Phil Bues, Research Manager, Cloud Security at IDC

Enhance incident response workflow with contextual alerts

Incident responders commonly deal with a high volume of automatically generated alerts from their IT environment. Initially, the response team members sift through numerous log sources to uncover relevant details about affected resources and the causes of alerts, which adds time to the remediation process. This manual process also consumes valuable analyst time, particularly if the alert turns out to be a false positive.

To reduce this burden on incident responders, customers can use generative AI to perform initial investigations and generate human-readable context from alerts. Amazon QuickSight Q is a generative AI–powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. Incident responders can now combine these capabilities with data in Amazon Security Lake to begin their investigation and build detailed dashboards in minutes. This allows them to visually identify resources and activities that could potentially trigger alerts, giving incident responders a head start in their response efforts. The approach can also be used to validate alert information against alternative sources, enabling faster identification of false positives and allowing responders to focus on genuine security incidents.

As natural language processing (NLP) models evolve they can also interpret incident response playbooks, often written in a human-readable, step-by-step format. Amazon Q Apps will enable the incident responder to use natural language to quickly and securely build their own generative AI playbooks to automate their tasks. This automation can streamline incident response tasks such as creating tickets in the relevant ticketing system, logging bugs in the version control system that hosts application code, and tagging resources as non-compliant if they fail to adhere to the organization’s IT security policy.

Incident investigation with automatic dashboard generation

By using Amazon QuickSight Q and data in Security Lake, incident responders can now generate customized visuals based on the specific characteristics and context of the incident they are investigating. The incident response team can start with a generic investigative question and automatically produce dashboards without writing SQL queries or learning the complexities of how to build a dashboard.

After building a topic in Amazon QuickSight Q. The investigator can either ask their own question or utilize the AI generated questions that Q has provided. In the example shown in Figure 3, the incident responder suspects a potential security incident and is seeking more information about Create or Update API calls in their environment, by asking ” I’m looking for what accounts ran an update or create api call in the last week, can you display the API Operation?

Figure 3: AI Generated visual to begin the investigation

This automatically generated dashboard can be saved for later and from this dashboard the investigator can just focus on one account with a simple right click and get more information about the API calls performed by a single account. as shown in Figure 4.

Figure 4: Investigating API calls for a single account

Using the centralized security data in Amazon Security Lake and Amazon QuickSight Q, incident responders can jump-start investigations and rapidly validate potential threats without having to write complex queries or manually construct dashboards.

Updates since GA launch

Security Lake makes it simpler to analyze security data, gain a more comprehensive understanding of security across your entire organization, and improve the protection of your workloads, applications, and data. Since the general availability release in 2023, we have made various updates to the service. It’s now available in 17 AWS Regions globally. To assist you in evaluating your current and future Security Lake usage and cost estimates, we’ve introduced a new usage page. If you’re currently in a 15-day free trial, your trial usage can serve as a reference for estimating post-trial costs. Accessing Security Lake usage and projected costs is as simple as logging into the Security Lake console.

We have released an integration with Amazon Detective that enables querying and retrieving logs stored in Security Lake. Detective begins pulling raw logs from Security Lake related to AWS CloudTrail management events and Amazon VPC Flow Logs. Security Lake enhances analytics performance with support for OCSF 1.1.0 and Apache Iceberg. Also, Security Lake has integrated several OCSF mapping enhancements, including OCSF Observables, and has adopted the latest version of the OCSF datetime profile for improved usability. Security Lake seamlessly centralizes and normalizes Amazon Elastic Kubernetes Service (Amazon EKS) audit logs, simplifying monitoring and investigation of potential suspicious activities in your Amazon EKS clusters, without requiring additional setup or affecting existing configurations.

Developments in Partner integrations

Over 100 sources of data are available in Security Lake from AWS and third-party sources. Customers can ingest logs into Security Lake directly from their sources. Security findings from over 50 partners are available through AWS Security Hub, and software as a service (SaaS) application logs from popular platforms such as Salesforce, Slack, and Smartsheet can be sent into Security Lake through AWS AppFabric. Since the GA launch, Security Lake has added 18 partner integrations, and we now offer over 70 direct partner integrations. New source integrations include AIShield by Bosch, Contrast Security, DataBahn, Monad, SailPoint, Sysdig, and Talon. New subscribers who have built integrations include Cyber Security Cloud, Devo, Elastic, Palo Alto Networks, Panther, Query.AI, Securonix, and Tego Cyber and new service partner offerings from Accenture, CMD Solutions, part of Mantel Group, Deloitte, Eviden, HOOP Cyber, Kyndryl, Kudelski Security, Infosys, Leidos, Megazone, PwC and Wipro.

Developments in the Open Cybersecurity Schema Framework

The OCSF community has grown over the last year to almost 200 participants across security-focused independent software vendors (ISVs), government agencies, educational institutions, and enterprises. Leading cybersecurity organizations have contributed to the OCSF schema, such as Tanium and Cisco Duo. Many existing Security Lake partners are adding support for OCSF version 1.1, including Confluent, Cribl, CyberArk, DataBahn, Datadog, Elastic, IBM, Netskope, Orca Security, Palo Alto Networks, Panther, Ripjar, Securonix, SentinelOne, SOC Prime, Sumo Logic, Splunk, Tanium, Tego Cyber, Torq, Trellix, Stellar Cyber, Query.AI, Swimlane, and more. Members of the OCSF community who want to validate their OCSF schema for Security Lake have access to a validation script on GitHub.

Get help from AWS Professional Services

The global team of experts in the AWS Professional Services organization can help customers realize their desired business outcomes with AWS. Our teams of data architects and security engineers collaborate with customer security, IT, and business leaders to develop enterprise solutions.

The AWS ProServe Amazon Security Lake Assessment offering is a complimentary workshop for customers. It entails a two-week interactive assessment, delving deep into customer use cases and creating a program roadmap to implement Security Lake alongside a suite of analytics solutions. Through a series of technical and strategic discussions, the AWS ProServe team analyzes use cases for data storage, security, search, visualization, analytics, AI/ML, and generative AI. The team then recommends a target future state architecture to achieve the customer’s security operations goals. At the end of the workshop, customers receive a draft architecture and implementation plan, along with infrastructure cost estimates, training, and other technical recommendations.

Summary

In this post, we showcased how customers use Security Lake to collect, query, and centralize logs on AWS. We discussed new applications for Security Lake and generative AI for threat hunting and incident response. We invite you to discover the benefits of using Security Lake through our 15-day free trial and share your feedback. To help you in getting started and building your first security data lake, we offer a range of resources including an infographic, eBook, demo videos, and webinars. There are many different use cases for Security Lake that can be tailored to suit your AWS environment. Join us at AWS re:Inforce 2024, our annual cloud security event, for more insights and firsthand experiences of how Security Lake can help you centralize, normalize, and optimize your security data.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on X.

How Amazon Security Lake is helping customers simplify security data management for proactive threat analysis
Author: Nisha Amthul