AWS re:Invent 2020 Keynotes – Analytics + Networking

Another batch of announcements and new products from AWS re:Invent 2020, the annual large-scale cloud conference, this time in analytics and network infrastructure. Many features are already available in preview; read on to find out which ones. AWS architects will discuss the details in the Russian-language Twitch stream that they hold regularly during re:Invent. The link to the Twitch stream is at the end of the article.

Analytics

AWS Lake Formation New Features (Preview)

New AWS Lake Formation functionality is available in preview: transactions, row-level security, and performance improvements. It works through new, open, public APIs for updating and accessing data lakes.

Transactions are implemented using “governed tables”, a new type of Amazon S3-based table that supports ACID transactions. Transactions simplify ETL (data transformation) scripts and let different users add, delete, and modify records in different governed tables concurrently and reliably.

AWS Lake Formation automatically compacts and optimizes governed table storage in the background to improve query performance.

More details here

Redshift

RA3.xlplus nodes and additional announcements for Amazon Redshift

RA3.xlplus is the third and smallest node type in the RA3 family, which lets you scale compute and storage separately. The new node type expands the compute choices for Amazon Redshift clusters.

More details here

Ability to move a cluster between Availability Zones (AZ)

The move cluster feature moves a cluster to another AZ in one step without the need to make changes to the application. When a cluster is moved to a different AZ, the new cluster will have the same endpoint so that applications can continue to run unchanged. The feature is free and available for RA3 clusters.

More details here

Automatic table optimization

Automatic Table Optimization continuously monitors how queries interact with tables and uses machine learning to select the best sort and distribution keys to optimize query performance across the cluster.

More details here

Sharing Data Between Amazon Redshift Clusters (Preview)

A new data sharing feature in Amazon Redshift is available in preview. It lets you share data between Redshift clusters securely and easily, in real time. Data sharing simplifies data processing, increases productivity, and reduces costs: everything you are used to within a single Redshift cluster is now available across multiple clusters working on the same data.

Because RA3 nodes use managed storage that is separate from compute, multiple clusters get instant, high-performance access to the data without copying or moving it. Stale reads are also ruled out: all clusters work on a single, always up-to-date copy of the data that includes the latest changes. There is no additional cost to share data across Amazon Redshift clusters.

More details here

Federated queries from Amazon Redshift to Amazon RDS for MySQL and Amazon Aurora MySQL (Preview)

Amazon Redshift federated queries let you bring in data from transactional databases for BI, reporting, and operational analytics. The Amazon Redshift optimizer pushes part of the computation down to the remote databases, which reduces network traffic and speeds up queries. Today, we are extending federated query support to Amazon RDS for MySQL and Amazon Aurora MySQL. The feature is available in preview.

Built-in JSON support (preview)

Today we are introducing native support for JSON and semi-structured data in Amazon Redshift, available in preview. A new data type, SUPER, stores semi-structured data in Redshift tables. Support for the PartiQL query language has also been added for querying and processing such data.

More details here

Amazon EMR Studio (Preview)

Amazon EMR Studio, a Jupyter-based IDE, has been announced. It supports authentication with enterprise SSO providers and lets analysts and data engineers develop analytical applications and data processing systems in R, Python, Scala, and PySpark. Spark UI and YARN Timeline Service are also available to make debugging easier. EMR Studio notebooks run on existing EMR clusters or launch new ones using ready-made CloudFormation templates for EMR.

Details here

Amazon EMR on Amazon EKS

With the new EMR deployment option (Amazon EMR on Amazon EKS), customers can automate the provisioning and management of open source big data frameworks on Amazon EKS. Customers can now run Spark applications alongside other types of applications in the same EKS cluster, improving resource utilization and simplifying infrastructure management.

Amazon EMR automatically packages your application into a container with the big data framework and provides ready-made connectors for integration with other AWS services. EMR then deploys the application to the EKS cluster and manages logging and monitoring. With EMR on EKS, you can get up to 3 times faster performance by using the performance-optimized Spark runtime included in EMR, compared with standard Apache Spark on EKS.

More details here

Networking

VPC Reachability Analyzer

The new VPC Reachability Analyzer lets you diagnose network reachability between two endpoints without sending any packets. It reads the configuration of all resources in the VPC and uses automated reasoning to determine which network paths are feasible, analyzing all possible paths through the network. To learn more about how the automated reasoning algorithms work, see the re:Invent session or read this document.
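To give a feel for what configuration-only analysis means, here is a toy Python sketch. It is not the algorithm AWS uses: the resource names, the `routes` map, and the `allowed_ports` rule model are all hypothetical and far simpler than real VPC configuration. It only shows the idea of searching a graph built from configuration instead of probing the network.

```python
from collections import deque

# Hypothetical configuration: which resource forwards traffic to which,
# and which destination ports each resource lets through.
routes = {
    "instance-a": ["subnet-a"],
    "subnet-a": ["peering-1"],
    "peering-1": ["subnet-b"],
    "subnet-b": ["instance-b"],
    "instance-b": [],
}
allowed_ports = {
    "instance-a": {22, 443},
    "subnet-a": {22, 443},
    "peering-1": {22, 443},
    "subnet-b": {443},
    "instance-b": {443},
}


def find_path(source, destination, port):
    """Breadth-first search over the configuration graph.

    Returns the list of hops if every hop allows `port`, otherwise None.
    No packets are sent; only the dictionaries above are read.
    """
    if port not in allowed_ports.get(source, set()):
        return None
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for nxt in routes.get(node, []):
            if nxt not in visited and port in allowed_ports.get(nxt, set()):
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


print(find_path("instance-a", "instance-b", 443))  # a full hop-by-hop path
print(find_path("instance-a", "instance-b", 22))   # None: subnet-b blocks port 22
```

In this toy model the second query fails because one hop does not allow port 22, which is the kind of blocking component a configuration-based analysis can point to without any live traffic.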

More details here

AWS Transit Gateway Connect

Overlay SD-WANs (Software-Defined Wide Area Networks) are used to connect offices or data centers over the public Internet. Cloud infrastructure now needs to be connected to these same networks. AWS Transit Gateway is often used at the network edge to connect customer networks to the AWS backbone.

With the new AWS Transit Gateway Connect functionality, there is an easy way to extend your SD-WAN infrastructure into the AWS Cloud. Instead of multiple IPsec VPN tunnels between Transit Gateway and SD-WAN network appliances, Transit Gateway Connect uses GRE tunnels. It also supports dynamic routing with BGP and integrates with the AWS Transit Gateway Network Manager monitoring service and a set of partner solutions.

All of this simplifies network design, improves performance, and makes it easier to expand SD-WANs to AWS.
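For readers unfamiliar with GRE, the following Python sketch shows only the basic framing defined in RFC 2784: a 4-byte header placed in front of the inner packet. It says nothing about how Transit Gateway Connect is configured; the function names and the placeholder payload are assumptions made for illustration, and the outer IP header that carries the GRE frame is left out.

```python
import struct

GRE_PROTO_IPV4 = 0x0800  # EtherType for IPv4, carried in the GRE protocol field


def gre_encapsulate(inner_packet: bytes) -> bytes:
    """Prepend a base GRE header: no checksum, no key, version 0."""
    flags_and_version = 0
    header = struct.pack("!HH", flags_and_version, GRE_PROTO_IPV4)
    return header + inner_packet


def gre_decapsulate(frame: bytes) -> bytes:
    """Strip the base GRE header and return the inner packet."""
    flags_and_version, proto = struct.unpack("!HH", frame[:4])
    if flags_and_version != 0 or proto != GRE_PROTO_IPV4:
        raise ValueError("unsupported GRE header in this sketch")
    return frame[4:]


inner = b"placeholder-ip-packet"  # stands in for a real inner IP packet
frame = gre_encapsulate(inner)
print(len(frame) - len(inner))  # bytes added by the base GRE header
```

The base header carries no encryption or authentication, which is one reason GRE is lighter than an IPsec tunnel.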

More details here

IGMP support in AWS Transit Gateway

AWS Transit Gateway introduces Internet Group Management Protocol (IGMP) support, making it easier to manage applications that use IP multicast.

Customers have previously used AWS Transit Gateway to run multicast applications in the cloud. Now, with IGMP support, it’s easier to scale and manage multicast group membership. You no longer need to configure static multicast groups, sources, and receivers; Transit Gateway automatically adds and removes group members using IGMP.

IGMP is an open standard and many multicast applications rely on it. It is now easier to migrate them to the cloud.
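On the application side, IGMP membership is usually triggered through a socket option rather than sent by hand. Below is a minimal Python sketch, assuming a hypothetical group address 239.1.1.1 and port 5007 (neither comes from the announcement). Setting `IP_ADD_MEMBERSHIP` asks the host's network stack to join the group, and the stack then sends the IGMP membership report on the application's behalf.

```python
import socket


def build_membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def join_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join a multicast group.

    The join makes the operating system emit an IGMP membership report;
    closing the socket (or IP_DROP_MEMBERSHIP) leaves the group again.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        build_membership_request(group),
    )
    return sock
```

A receiver would call something like `join_group("239.1.1.1", 5007)` and then read datagrams with `recvfrom`. The interface address 0.0.0.0 lets the operating system pick the interface.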

More details here

Russian-language Twitch session

All the analytics and network infrastructure news will be discussed today in the Russian-language Twitch stream. Leading AWS solutions architects have picked the most interesting announcements, have already tried many of them, and will share their impressions and answer all your questions. If you haven’t joined the streams yet: registration link… By the way, if you missed the previous Russian-language streams, you can watch the recordings on Twitch.

Previous news from AWS re:Invent 2020:
AWS re:Invent. Day 1 Top Announcements (Andy Jassy, Business Applications)
AWS re:Invent. Day 1 Main Announcements (Storage)
AWS re:Invent 2020 Keynotes – Machine Learning
