



    Date: 21.05.2024
    050623-The Future of Big Data with Data Lakehouse

    The following are some of the data tools that many cloud providers offer their users:

    Object Storage
    Enables organizations to store any type of data in its native format. This is ideal for building modern applications that require scale and flexibility.

    Data Integration
    Easy-to-use tools that connect to public and private data sources, such as databases and applications, and reliably transfer and synchronize the data to the datastores in the data lake.

    Data Preparation
    Visual tools to create data transformations between the source and the target.

    Data Catalog
    An inventory of enterprise-wide data assets to help search, explore, and govern data in the data lake.

    Data Streaming
    Lets organizations process data in real time, enabling resilient stream processing operations such as filters, joins, maps, aggregations, and other transformations.

    Data Management
    Hadoop, Spark, databases, and query tools that help organizations manage data across all stores in the data lake.

    Analytics
    Tools to help organizations understand and discover trends in their data and use them to guide decision-making.

    Using those tools, companies can start data lakes for their unstructured data on a small scale and continually expand them with new data types, data sources, and applications to derive value from the data.
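
To make the stream processing operations named in the list above (filters, maps, joins, aggregations) more concrete, here is a minimal sketch using plain Python generators. It does not represent any particular cloud provider's streaming service; the event fields, the `USERS` lookup table, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical click events; the field names are illustrative only.
EVENTS = [
    {"user_id": 1, "action": "view", "ms": 120},
    {"user_id": 2, "action": "click", "ms": 45},
    {"user_id": 1, "action": "click", "ms": 30},
    {"user_id": 3, "action": "click", "ms": 80},
]

# Hypothetical reference data to join against (user_id -> region).
USERS = {1: "emea", 2: "apac", 3: "emea"}


def only_clicks(events):
    """Filter: keep only click events."""
    return (e for e in events if e["action"] == "click")


def to_seconds(events):
    """Map: add a derived field to each event."""
    for e in events:
        yield {**e, "seconds": e["ms"] / 1000}


def join_region(events, users):
    """Join: enrich each event with the user's region."""
    for e in events:
        yield {**e, "region": users.get(e["user_id"], "unknown")}


def count_by_region(events):
    """Aggregation: count events per region."""
    counts = defaultdict(int)
    for e in events:
        counts[e["region"]] += 1
    return dict(counts)


# Events flow through the stages one at a time, not as a loaded batch.
pipeline = join_region(to_seconds(only_clicks(iter(EVENTS))), USERS)
clicks_per_region = count_by_region(pipeline)
print(clicks_per_region)
```

A managed streaming service would add what this sketch leaves out, such as windowing, fault tolerance, and delivery guarantees, but the shape of the operations is the same.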

    Data lakes

    The emergence of public clouds had a profound impact on the way 
    organizations could tackle big data challenges. The availability of cheap, 
    reliable, and practically unlimited storage let companies ingest and store 
    data raw and unchanged, instead of cleaning, transforming, and aggregating it 
    before storage. That, in turn, enabled methods of analyzing the data that 
    previously weren’t available.
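
The idea of storing data raw and applying structure only when it is analyzed can be sketched as follows. This is a minimal illustration, assuming a local temporary directory as a stand-in for cloud object storage; the file name `orders.jsonl`, the record fields, and the helper functions are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def ingest_raw(lake_dir: Path, name: str, payload: bytes) -> Path:
    """Store the payload byte for byte, with no cleaning or aggregation."""
    target = lake_dir / "raw" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target


def read_orders(path: Path):
    """Apply structure only at read time by parsing each JSON line."""
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.strip():
            yield json.loads(line)


# Hypothetical raw export from a source system, kept exactly as received.
raw_payload = (
    b'{"order": 1, "total": "19.90", "country": "DE"}\n'
    b'{"order": 2, "total": "5.00", "country": "US"}\n'
    b'{"order": 3, "total": "7.10", "country": "DE"}\n'
)

with tempfile.TemporaryDirectory() as tmp:
    path = ingest_raw(Path(tmp), "orders.jsonl", raw_payload)
    stored_unchanged = path.read_bytes() == raw_payload

    # Two different analyses over the same untouched raw file.
    revenue = sum(float(o["total"]) for o in read_orders(path))
    orders_per_country = {}
    for o in read_orders(path):
        country = o["country"]
        orders_per_country[country] = orders_per_country.get(country, 0) + 1
```

Because nothing was aggregated before storage, a question nobody anticipated at ingestion time (here, orders per country) can still be answered from the same file.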
    James Dixon, then chief technology officer at Pentaho, coined the term “data 
    lake” for this new approach. Rather than creating isolated data warehouses, a 
    data lake promised to be a single repository for all of a company’s information. 
    Data lakes can be built with Hadoop technologies or with object storage and managed data services provided by a cloud provider. By delegating the infrastructure work and applications management to a cloud provider, companies can decrease the IT work of big data tasks and focus on data management.
