Install


Download a pre-built library

You can download Analytics Zoo releases and nightly builds from the Release Page.


Currently, Analytics Zoo releases are hosted on Maven Central. To add the Analytics Zoo dependency to your own project, use a snippet like the following, keeping exactly one of the bracketed Spark versions in the artifactId:

<dependency>
    <groupId>com.intel.analytics.zoo</groupId>
    <artifactId>analytics-zoo-bigdl_0.10.0-[spark_2.1.1|spark_2.2.0|spark_2.3.1|spark_2.4.3]</artifactId>
    <version>${ANALYTICS_ZOO_VERSION}</version>
</dependency>

SBT developers can use

libraryDependencies += "com.intel.analytics.zoo" % "analytics-zoo-bigdl_0.10.0-[spark_2.1.1|spark_2.2.0|spark_2.3.1|spark_2.4.3]" % "${ANALYTICS_ZOO_VERSION}"
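The bracketed list in the artifactId is shorthand for "pick one Spark version". As a minimal sketch of that expansion, the hypothetical helper below (zoo_artifact_id is not part of Analytics Zoo) prints the concrete artifactId for one of the Spark versions listed above and rejects anything else:

```shell
# Hypothetical helper: expand the "[a|b|c]" shorthand into one concrete artifactId.
# The accepted values are exactly the Spark versions listed in the dependency snippet.
zoo_artifact_id() {
    local spark="$1"
    case "$spark" in
        spark_2.1.1|spark_2.2.0|spark_2.3.1|spark_2.4.3)
            printf 'analytics-zoo-bigdl_0.10.0-%s\n' "$spark"
            ;;
        *)
            printf 'unsupported Spark profile: %s\n' "$spark" >&2
            return 1
            ;;
    esac
}

# Prints: analytics-zoo-bigdl_0.10.0-spark_2.4.3
zoo_artifact_id spark_2.4.3
```

The printed string is what goes into the artifactId element of pom.xml or the second field of the SBT dependency.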

Remarks:


Currently, the Analytics Zoo development version is hosted on Sonatype.

To link your application with the latest Analytics Zoo development version, add the same dependencies as described above for Analytics Zoo releases, but set ${ANALYTICS_ZOO_VERSION} to the latest development version, and add the repository below to your pom.xml.

<repository>
    <id>sonatype</id>
    <name>sonatype repository</name>
    <url>https://oss.sonatype.org/content/groups/public/</url>
    <releases>
        <enabled>true</enabled>
    </releases>
    <snapshots>
        <enabled>true</enabled>
    </snapshots>
</repository>

SBT developers can use

resolvers += "ossrh repository" at "https://oss.sonatype.org/content/repositories/snapshots/"

Download Analytics Zoo Source

The Analytics Zoo source code is available on GitHub:

$ git clone https://github.com/intel-analytics/analytics-zoo.git

By default, git clone downloads the development version of Analytics Zoo. If you want a release version, use git checkout to switch to it.
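A minimal sketch of switching a clone to a release, assuming releases are marked with git tags (an assumption; check the repository for the actual tag or branch names). The helper names list_releases and checkout_release are hypothetical:

```shell
# Hypothetical helpers; they only wrap standard git commands.

# List the tags available in a local clone.
list_releases() {
    local repo_dir="$1"
    git -C "$repo_dir" tag --list
}

# Switch the working tree of a local clone to the given tag or branch.
checkout_release() {
    local repo_dir="$1" ref="$2"
    git -C "$repo_dir" checkout "$ref"
}

# Usage, after cloning (replace <release-tag> with a name printed by list_releases):
#   list_releases analytics-zoo
#   checkout_release analytics-zoo <release-tag>
```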

Setup Build Environment

The following instructions are aligned with master code.

Maven 3 is needed to build Analytics Zoo; you can download it from the Maven website.

After installing Maven 3, please set the environment variable MAVEN_OPTS as follows:

$ export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

When compiling with Java 7, you need to add the option "-XX:MaxPermSize=1G".
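The two cases above can be combined into one step. The sketch below uses a hypothetical helper, zoo_maven_opts, that takes the Java major version as an argument and appends the MaxPermSize option only for Java 7; the option values are the ones given above:

```shell
# Hypothetical helper: print the MAVEN_OPTS value for a given Java major version.
zoo_maven_opts() {
    local java_major="$1"
    local opts="-Xmx2g -XX:ReservedCodeCacheSize=512m"
    # Per the note above, Java 7 additionally needs a larger PermGen.
    if [ "$java_major" -eq 7 ]; then
        opts="$opts -XX:MaxPermSize=1G"
    fi
    printf '%s\n' "$opts"
}

# Example: set MAVEN_OPTS for a Java 8 build.
MAVEN_OPTS="$(zoo_maven_opts 8)"
export MAVEN_OPTS
echo "$MAVEN_OPTS"
```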

It is highly recommended that you build Analytics Zoo using the make-dist.sh script, which handles the MAVEN_OPTS variable for you.

Once the source is downloaded, you can build Analytics Zoo with the following command:

$ bash make-dist.sh

After that, you can find a dist folder, which contains all the files needed to run an Analytics Zoo program. The files in dist include:

The instructions above will build Analytics Zoo with Spark 2.0 (using Scala 2.11). It is highly recommended to use Java 8 when running with Spark 2.x; otherwise you may observe very poor performance.

Build for Spark 1.6

To build for Spark 1.6 (which uses Scala 2.10 by default), pass -P spark_1.6 to the make-dist.sh script:

$ bash make-dist.sh -P spark_1.6

Build for Scala 2.10 or 2.11

By default, make-dist.sh uses Scala 2.11 for Spark 2.1, and Scala 2.10 for Spark 1.6. To override the default behaviors, you can pass -P scala_2.10 or -P scala_2.11 to make-dist.sh as appropriate.
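To make the profile flags concrete, the sketch below assembles the make-dist.sh command line from a list of profile names and prints it without running anything. zoo_build_cmd is a hypothetical helper; the profile names are the ones mentioned above, and which combinations actually build is not verified here:

```shell
# Hypothetical helper: print (do not run) the make-dist.sh invocation
# for the given Maven profiles, one "-P <profile>" per argument.
zoo_build_cmd() {
    local cmd="bash make-dist.sh"
    local profile
    for profile in "$@"; do
        cmd="$cmd -P $profile"
    done
    printf '%s\n' "$cmd"
}

zoo_build_cmd              # default build
zoo_build_cmd spark_1.6    # Spark 1.6 with its default Scala version
zoo_build_cmd scala_2.10   # override only the Scala version
```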


Build with Maven

To build Analytics Zoo directly using Maven, run the command below:

$ mvn clean package -DskipTests

After that, you can find the jar packages in PATH_TO_ANALYTICS_ZOO/target/, where PATH_TO_ANALYTICS_ZOO is the path to the Analytics Zoo source directory.

Note that the instructions above will build Analytics Zoo with Spark 2.0 (using Scala 2.11) for Linux. Similarly, you may customize the default behavior by passing the following parameters to Maven:


Setup IDE

We set the scope of the Spark-related libraries to provided in pom.xml. We don't want to package the Spark-related jars, which would make the Analytics Zoo jar huge; and since Analytics Zoo is generally invoked by spark-submit, these dependencies are provided by Spark at run time.

This causes a problem in an IDE: when you run an application, it throws NoClassDefFoundError because the library scope is provided.

You can easily change the scopes by enabling the all-in-one profile.