Open source projects: (169)

    Project name: Apache Falcon
    Project description: apache: Apache Falcon is a data processing and management solution for Hadoop designed for data motion, coordination of data pipelines, lifecycle management, and data discovery. Falcon enables end consumers to quickly onboard their data and its associated...

    Project name: Apache Hadoop
    Project description: apache: Hadoop is a distributed computing platform. This includes the Hadoop Distributed Filesystem (HDFS) and an implementation of MapReduce. | openhub: Hadoop is a framework for running applications on large clusters of commodity hardware. The Hadoop f...

    Project name: Apache Hive
    Project description: apache: The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides * tools to enable easy data extract/transform/load (ETL) * a mechanism t...

    Project name: Apache Mahout
    Project description: apache: Scalable machine learning library | openhub: Apache Mahout's goal is to build scalable machine learning libraries. With scalable we mean: Scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch based co...

    Project name: Apache Giraph
    Project description: apache: Apache Giraph is an iterative graph processing system built for high scalability. For example, it is currently used at Facebook to analyze the social graph formed by users and their connections. | oschina: Apache Giraph is a scalable, distributed, iterative graph processing system, inspired by BSP...