
2 Getting Started - Reference Documentation

Authors: The Whole GPars Gang

Version: 1.0.0

2 Getting Started

Let's set out a few assumptions before we get started:
  1. You know and use Groovy and Java: otherwise you'd not be investing your valuable time studying a concurrency and parallelism library for Groovy and Java.
  2. You definitely want to write your code employing concurrency and parallelism using Groovy and Java.
  3. If you are not using Groovy for your code, you are prepared to pay the inevitable verbosity tax of using Java.
  4. You target multi-core hardware with your code.
  5. You appreciate that in concurrent and parallel code things can happen at any time, in any order, and most likely with more than one thing happening at once.

With those assumptions in place, we get started.

It's becoming more and more obvious that dealing with concurrency and parallelism at the thread/synchronized/lock level, as provided by the JVM, is far too low a level to be safe and comfortable. Many high-level concepts, such as actors and dataflow, have been around for quite some time: parallel computers were in use, at least in data centres if not on the desktop, long before multi-core chips hit the hardware mainstream. Now is the time to adopt these higher-level abstractions in the mainstream software industry. This is what GPars enables for the Groovy and Java languages: it lets Groovy and Java programmers use higher-level abstractions and so makes developing concurrent and parallel software easier and less error-prone.

The concepts available in GPars can be categorized into three groups:

  1. Code-level helpers: Constructs that can be applied to small parts of the code-base, such as individual algorithms or data structures, without any major changes in the overall project architecture
    1. Parallel Collections
    2. Asynchronous Processing
    3. Fork/Join (Divide/Conquer)
  2. Architecture-level concepts: Constructs that need to be taken into account when designing the project structure
    1. Actors
    2. Communicating Sequential Processes (CSP)
    3. Dataflow
    4. Data Parallelism
  3. Shared Mutable State Protection: Although about 95% of current use of shared mutable state can be avoided using proper abstractions, good abstractions are still necessary for the remaining 5% of use cases, where shared mutable state cannot be avoided
    1. Agents
    2. Software Transactional Memory (not fully implemented in GPars as yet)

2.1 Downloading and Installing

GPars is now distributed as standard with Groovy. So if you have a Groovy installation, you should have GPars already. The exact version of GPars you have will, of course, depend on which version of Groovy you have. If you don't already have GPars, and you do have Groovy, then perhaps you should upgrade your Groovy!

If you do not have a Groovy installation, but get Groovy by using dependencies or just having the groovy-all artifact, then you will need to get GPars. Also if you want to use a version of GPars different from the one with Groovy, or have an old GPars-less Groovy you cannot upgrade, you will need to get GPars. The ways of getting GPars are:

  • Download the artifact from a repository and add it and all the transitive dependencies manually.
  • Specify a dependency in Gradle, Maven, or Ivy (or Gant, or Ant) build files.
  • Use Grapes (especially useful for Groovy scripts).

If you're building a Grails or a Griffon application, you can use the appropriate plugins to fetch the jar files for you.

The GPars Artifact

As noted above, GPars is now distributed as standard with Groovy. If, however, you have to manage this dependency manually, the GPars artifact is in the main Maven repository and in the Codehaus main and snapshots repositories. The released versions are in the Maven and Codehaus main repositories; the current development version (SNAPSHOT) is in the Codehaus snapshots repository. To use GPars from Gradle or Grapes, use the specification:

"org.codehaus.gpars:gpars:1.0.0"
for the release version, and:
"org.codehaus.gpars:gpars:1.1-SNAPSHOT"
for the development version. You will likely need to add the Codehaus snapshots repository manually to the search list in this latter case. Using Maven the dependency is:
<dependency>
    <groupId>org.codehaus.gpars</groupId>
    <artifactId>gpars</artifactId>
    <version>1.0.0</version>
</dependency>
or version 1.1-SNAPSHOT if using the latest snapshot.

Transitive Dependencies

GPars as a library depends on Groovy version 1.8 or later. Also, the Fork/Join concurrency library, namely jsr166y (an artifact from the JSR-166 Project), must be on the classpath for programs that use GPars to compile and execute. Released versions of this artifact are in the main Maven and Codehaus repositories. Development versions of the artifact are available in the Codehaus snapshots repository. Using Gradle or Grapes you would use the following dependency specification:

"org.codehaus.jsr166-mirror:jsr166y:1.7.0"
For Maven, the specification would be:
<dependency>
    <groupId>org.codehaus.jsr166-mirror</groupId>
    <artifactId>jsr166y</artifactId>
    <version>1.7.0</version>
</dependency>
The development versions have version number 1.7.0.1-SNAPSHOT.

GPars defines this dependency in its own descriptor, so ideally all dependency management should be taken care of automatically if you use Gradle, Grails, Griffon, Maven, Ivy or another automatic dependency resolution tool.

Please visit the page Integration on the GPars website for more details.

2.2 A Hello World Example

Once you are set up, try the following Groovy script to test that your setup is functioning as it should.
import static groovyx.gpars.actor.Actors.actor

/**
 * A demo showing two cooperating actors. The decryptor decrypts received messages
 * and replies them back. The console actor sends a message to decrypt, prints out
 * the reply and terminates both actors. The main thread waits on both actors to
 * finish using the join() method to prevent premature exit, since both actors use
 * the default actor group, which uses a daemon thread pool.
 * @author Dierk Koenig, Vaclav Pech
 */

def decryptor = actor {
    loop {
        react { message ->
            if (message instanceof String) reply message.reverse()
            else stop()
        }
    }
}

def console = actor {
    decryptor.send 'lellarap si yvoorG'
    react {
        println 'Decrypted message: ' + it
        decryptor.send false
    }
}

[decryptor, console]*.join()

You should get a message "Decrypted message: Groovy is parallel" printed out on the console when you run the code.

GPars has been designed primarily for use with the Groovy programming language. Of course all Java and Groovy programs are just bytecodes running on the JVM, so GPars can be used with Java source. Despite being aimed at Groovy code use, the solid technical foundation, plus the good performance characteristics, of GPars make it an excellent library for Java programs. In fact most of GPars is written in Java, so there is no performance penalty for Java applications using GPars.

For details please refer to the Java API section.

To quick-test using GPars via the Java API, you can compile and run the following Java code:

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.actor.DynamicDispatchActor;

public class StatelessActorDemo {
    public static void main(String[] args) throws InterruptedException {
        final MyStatelessActor actor = new MyStatelessActor();
        actor.start();
        actor.send("Hello");
        actor.sendAndWait(10);
        actor.sendAndContinue(10.0, new MessagingRunnable<String>() {
            @Override
            protected void doRun(final String s) {
                System.out.println("Received a reply " + s);
            }
        });
    }
}

class MyStatelessActor extends DynamicDispatchActor {
    public void onMessage(final String msg) {
        System.out.println("Received " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Integer msg) {
        System.out.println("Received a number " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Object msg) {
        System.out.println("Received an object " + msg);
        replyIfExists("Thank you");
    }
}

Remember though that you will almost certainly have to add the Groovy artifact to the build as well as the GPars artifact. GPars may well work at Java speeds with Java applications, but it still has some compilation dependencies on Groovy.

2.3 Code conventions

We follow certain conventions in the code samples. Understanding these may help you read and comprehend GPars code samples better.
  • The leftShift operator << has been overloaded on actors, agents and dataflow expressions (both variables and streams) to mean send a message or assign a value.

myActor << 'message'

myAgent << {account -> account.add('5 USD')}

myDataflowVariable << 120332

  • On actors and agents the default call() method has also been overloaded to mean send. So sending a message to an actor or agent may look like a regular method call.

myActor "message"

myAgent {house -> house.repair()}

  • The rightShift operator >> in GPars has the "when bound" meaning. So

myDataflowVariable >> {value -> doSomethingWith(value)}
will schedule the closure to run only after myDataflowVariable is bound to a value, with the value as a parameter.
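
For readers coming from Java, the "run only after a value is bound" behaviour can be approximated with the JDK's CompletableFuture. The sketch below is an analogy only, assuming Java 8 or later: it does not use GPars, the class name WhenBoundAnalogy is made up for illustration, and here the handler runs on the thread that completes the future rather than being scheduled on a thread pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Analogy only: CompletableFuture stands in for a dataflow variable here.
final class WhenBoundAnalogy {

    static List<String> run() {
        final List<String> log = new ArrayList<String>();
        final CompletableFuture<Integer> variable = new CompletableFuture<Integer>();

        // Roughly comparable to: myDataflowVariable >> {value -> doSomethingWith(value)}
        variable.thenAccept(value -> log.add("handler saw " + value));
        log.add("handler registered, nothing bound yet");

        // Roughly comparable to: myDataflowVariable << 120332
        variable.complete(120332);
        return log;
    }

    public static void main(final String[] args) {
        // The handler's line appears second, because it only runs once the value arrives.
        for (final String line : run()) System.out.println(line);
    }
}
```

Registering the handler before the value exists is the point of the example: the callback stays dormant until the single assignment happens.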

In samples we tend to statically import frequently used factory methods:

  • GParsPool.withPool()
  • GParsPool.withExistingPool()
  • GParsExecutorsPool.withPool()
  • GParsExecutorsPool.withExistingPool()
  • Actors.actor()
  • Actors.reactor()
  • Actors.fairReactor()
  • Actors.messageHandler()
  • Actors.fairMessageHandler()
  • Agent.agent()
  • Agent.fairAgent()
  • Dataflow.task()
  • Dataflow.operator()

It is more a matter of style preferences and personal taste, but we think static imports make the code more compact and readable.

2.4 Getting Set Up in an IDE

Adding the GPars jar files to your project or defining the appropriate dependencies in pom.xml should be enough to get you started with GPars in your IDE.

GPars DSL recognition

IntelliJ IDEA in both the free Community Edition and the commercial Ultimate Edition will recognize the GPars domain specific languages, complete methods like eachParallel(), reduce() or callAsync() and validate them. GPars uses the GroovyDSL mechanism, which teaches IntelliJ IDEA the DSLs as soon as the GPars jar file is added to the project.

2.5 Applicability of Concepts

GPars provides a lot of concepts to pick from. We're continuously building and updating a page that tries to help users choose the right abstraction for the task at hand. Please refer to the Concepts compared page for details.

To briefly summarize the suggestions, below you can find the basic guidelines:

  1. You're looking at a collection, which needs to be iterated or processed using one of the many beautiful Groovy collection methods, like each(), collect(), find() and such. Provided that processing each element of the collection is independent of the other items, GPars parallel collections are recommended.
  2. If you have a long-lasting calculation, which may safely run in the background, use the asynchronous invocation support in GPars. Since the GPars asynchronous functions can be composed, you can quickly parallelize complex functional calculations without having to mark independent calculations explicitly.
  3. You need to parallelize an algorithm at hand. You can identify a set of tasks with their mutual dependencies. The tasks typically do not need to share data, but instead some tasks may need to wait for other tasks to finish before starting. You're ready to express these dependencies explicitly in code. With GPars dataflow tasks you create internally sequential tasks, each of which can run concurrently with the others. Dataflow variables and channels provide the tasks with the capability to express their dependencies and to exchange data safely.
  4. You can't avoid using shared mutable state in your algorithm. Multiple threads will be accessing shared data and (some of them) modifying it. The traditional locking and synchronized approach feels too risky or unfamiliar. Go for agents, which will wrap your data and serialize all access to it.
  5. You're building a system with high concurrency demands. Tweaking a data structure here or a task there won't cut it. You need to build the architecture from the ground up with concurrency in mind. Message-passing might be the way to go.
    1. Groovy CSP will give you a highly deterministic and composable model for concurrent processes. The model is organized around the concept of calculations or processes, which run concurrently and communicate through synchronous channels.
    2. If you're trying to solve a complex data-processing problem, consider GPars dataflow operators to build a data flow network. The concept is organized around event-driven transformations wired into pipelines using asynchronous channels.
    3. Actors and Active Objects will shine if you need to build a general-purpose, highly concurrent and scalable architecture following the object-oriented paradigm.
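
To make guideline 4 a little more concrete, the idea of wrapping data and serializing all access to it can be imitated in plain Java with a single-threaded executor. This is only a conceptual sketch of the principle, not the GPars Agent implementation; the SerializedCounter class and its methods are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical class illustrating the agent idea; it is not part of GPars.
final class SerializedCounter {
    // All reads and writes of 'value' happen on this single worker thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private long value;

    // Queue an update; queued updates are applied one at a time.
    void add(final long delta) {
        worker.execute(new Runnable() {
            @Override
            public void run() {
                value += delta;
            }
        });
    }

    // Queue a read behind all pending updates and wait for the answer.
    long get() throws Exception {
        return worker.submit(new Callable<Long>() {
            @Override
            public Long call() {
                return value;
            }
        }).get();
    }

    void shutdown() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(final String[] args) throws Exception {
        final SerializedCounter counter = new SerializedCounter();
        final Thread[] clients = new Thread[4];
        for (int i = 0; i < clients.length; i++) {
            clients[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < 1000; j++) counter.add(1);
                }
            });
            clients[i].start();
        }
        for (final Thread client : clients) client.join();
        System.out.println("Final value: " + counter.get()); // prints "Final value: 4000"
        counter.shutdown();
    }
}
```

No client thread ever touches the counter directly, so no locks are needed in client code; the cost is that every access goes through a queue.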

Now you may have a better idea of what concepts to use on your current project. Go and check out more details on them in the User Guide.

2.6 What's New

The new GPars 1.0.0 release introduces a lot of gradual enhancements and improvements on top of the previous release, mainly in the dataflow area.

Check out the JIRA release notes

Project changes

See the Breaking Changes listing for the list of breaking changes.

Asynchronous functions

  • Allowed for delayed and explicit thread pool assignment strategies for asynchronous functions
  • Performance tuning to the asynchronous closure invocation mechanism

Parallel collections

  • Added a couple of new parallel collection processing methods to keep up with the innovation pace in Groovy
  • Merged the extra166y library into GPars

Fork / Join

Actors

  • StaticDispatchActor has been added to provide an easier-to-create and better-performing alternative to DynamicDispatchActor
  • A new method sendAndPromise has been added to actors to send a message and get a promise for the actor's future reply

Dataflow

  • Operator and selector speed-up
  • Kanban-style dataflow operator management has been added
  • Chaining of Promises using the new then() method
  • Exception propagation and handling for Promises
  • Added a DSL for easy operator pipe-lining
  • Lifecycle events for operators and selectors were added
  • Added support for custom error handlers
  • A generic way to shut down dataflow networks
  • A shutdown poison pill with immediate or delayed effect was added
  • Polished the way operators can be stopped
  • Added synchronous dataflow variables and channels
  • Read channels can report their length

Agent

Stm

Other

  • Removed deprecated classes and methods
  • Added numerous code examples and demos
  • Enhanced project documentation
  • Re-styled the user guide

Renaming hints

  • The makeTransparent() method that forces concurrent semantics on iteration methods (each, collect, find, etc.) has been removed
  • The stop() method on dataflow operators and selectors has been renamed to terminate() to match the naming used for actors
  • The reportError() method on dataflow operators and selectors has been replaced with the addErrorHandler() method
  • The RightShift (>>) operator of DataflowVariables and channels now calls then() instead of whenBound() and so can be chained

2.7 Java API - Using GPars from Java

Using GPars is very addictive, I guarantee. Once you get hooked you won't be able to code without it. Should the world force you to write code in Java, you will still be able to benefit from most GPars features.

Java API specifics

Some parts of GPars are irrelevant in Java and it is better to use the underlying Java libraries directly:

  • Parallel Collection - use jsr-166y library's Parallel Array directly
  • Fork/Join - use jsr-166y library's Fork/Join support directly
  • Asynchronous functions - use Java executor services directly

The other parts of GPars can be used from Java just like from Groovy, although most will miss the Groovy DSL capabilities.

GPars Closures in Java API

To overcome the lack of closures as a language element in Java and to avoid forcing users to use Groovy closures directly through the Java API, a few handy wrapper classes have been provided to help you define callbacks, actor bodies or dataflow tasks.

  • groovyx.gpars.MessagingRunnable - used for single-argument callbacks or actor body
  • groovyx.gpars.ReactorMessagingRunnable - used for ReactiveActor body
  • groovyx.gpars.DataflowMessagingRunnable - used for dataflow operators' body

These classes can be used wherever the GPars API expects a Groovy closure.

Actors

The DynamicDispatchActor as well as the ReactiveActor classes can be used just like in Groovy:

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.actor.DynamicDispatchActor;

public class StatelessActorDemo {
    public static void main(String[] args) throws InterruptedException {
        final MyStatelessActor actor = new MyStatelessActor();
        actor.start();
        actor.send("Hello");
        actor.sendAndWait(10);
        actor.sendAndContinue(10.0, new MessagingRunnable<String>() {
            @Override
            protected void doRun(final String s) {
                System.out.println("Received a reply " + s);
            }
        });
    }
}

class MyStatelessActor extends DynamicDispatchActor {
    public void onMessage(final String msg) {
        System.out.println("Received " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Integer msg) {
        System.out.println("Received a number " + msg);
        replyIfExists("Thank you");
    }

    public void onMessage(final Object msg) {
        System.out.println("Received an object " + msg);
        replyIfExists("Thank you");
    }
}

Although there are few differences between using GPars from Groovy and from Java, notice that the callbacks instantiate the MessagingRunnable class in place of a Groovy closure.

import groovy.lang.Closure;
import groovyx.gpars.ReactorMessagingRunnable;
import groovyx.gpars.actor.Actor;
import groovyx.gpars.actor.ReactiveActor;

public class ReactorDemo {
    public static void main(final String[] args) throws InterruptedException {
        final Closure handler = new ReactorMessagingRunnable<Integer, Integer>() {
            @Override
            protected Integer doRun(final Integer integer) {
                return integer * 2;
            }
        };
        final Actor actor = new ReactiveActor(handler);
        actor.start();

        System.out.println("Result: " + actor.sendAndWait(1));
        System.out.println("Result: " + actor.sendAndWait(2));
        System.out.println("Result: " + actor.sendAndWait(3));
    }
}

Convenience factory methods

Obviously, all the essential factory methods to build actors quickly are available where you'd expect them.

import groovy.lang.Closure;
import groovyx.gpars.ReactorMessagingRunnable;
import groovyx.gpars.actor.Actor;
import groovyx.gpars.actor.Actors;

public class ReactorDemo {
    public static void main(final String[] args) throws InterruptedException {
        final Closure handler = new ReactorMessagingRunnable<Integer, Integer>() {
            @Override
            protected Integer doRun(final Integer integer) {
                return integer * 2;
            }
        };
        final Actor actor = Actors.reactor(handler);

        System.out.println("Result: " + actor.sendAndWait(1));
        System.out.println("Result: " + actor.sendAndWait(2));
        System.out.println("Result: " + actor.sendAndWait(3));
    }
}

Agents

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.agent.Agent;

public class AgentDemo {
    public static void main(final String[] args) throws InterruptedException {
        final Agent counter = new Agent<Integer>(0);
        counter.send(10);
        System.out.println("Current value: " + counter.getVal());
        counter.send(new MessagingRunnable<Integer>() {
            @Override
            protected void doRun(final Integer integer) {
                counter.updateValue(integer + 1);
            }
        });
        System.out.println("Current value: " + counter.getVal());
    }
}

Dataflow Concurrency

Both DataflowVariables and DataflowQueues can be used from Java without any hiccups. Just avoid the handy overloaded operators and go straight to the methods, like bind, whenBound, getVal and others. You may also continue using dataflow tasks, passing them instances of Runnable or Callable just as you would a Groovy Closure.

import groovyx.gpars.MessagingRunnable;
import groovyx.gpars.dataflow.DataflowVariable;
import groovyx.gpars.group.DefaultPGroup;

import java.util.concurrent.Callable;

public class DataflowTaskDemo {
    public static void main(final String[] args) throws InterruptedException {
        final DefaultPGroup group = new DefaultPGroup(10);

        final DataflowVariable a = new DataflowVariable();

        group.task(new Runnable() {
            public void run() {
                a.bind(10);
            }
        });

        final DataflowVariable result = group.task(new Callable() {
            public Object call() throws Exception {
                return (Integer) a.getVal() + 10;
            }
        });

        result.whenBound(new MessagingRunnable<Integer>() {
            @Override
            protected void doRun(final Integer integer) {
                System.out.println("arguments = " + integer);
            }
        });

        System.out.println("result = " + result.getVal());
    }
}

Dataflow operators

The sample below should illustrate the main differences between Groovy and Java API for dataflow operators.

  1. Use the convenience factory methods accepting lists of channels to create operators or selectors
  2. Use DataflowMessagingRunnable to specify the operator body
  3. Call getOwningProcessor() to get hold of the operator from within the body in order to e.g. bind output values

import groovyx.gpars.DataflowMessagingRunnable;
import groovyx.gpars.dataflow.Dataflow;
import groovyx.gpars.dataflow.DataflowQueue;
import groovyx.gpars.dataflow.operator.DataflowProcessor;

import java.util.Arrays;
import java.util.List;

public class DataflowOperatorDemo {
    public static void main(final String[] args) throws InterruptedException {
        final DataflowQueue stream1 = new DataflowQueue();
        final DataflowQueue stream2 = new DataflowQueue();
        final DataflowQueue stream3 = new DataflowQueue();
        final DataflowQueue stream4 = new DataflowQueue();

        // Doubles each value read from stream1 and writes it to stream2
        final DataflowProcessor op1 = Dataflow.selector(Arrays.asList(stream1), Arrays.asList(stream2), new DataflowMessagingRunnable(1) {
            @Override
            protected void doRun(final Object... objects) {
                getOwningProcessor().bindOutput(2 * (Integer) objects[0]);
            }
        });

        final List secondOperatorInput = Arrays.asList(stream2, stream3);

        // Adds one value from stream2 to one value from stream3 and writes the sum to stream4
        final DataflowProcessor op2 = Dataflow.operator(secondOperatorInput, Arrays.asList(stream4), new DataflowMessagingRunnable(2) {
            @Override
            protected void doRun(final Object... objects) {
                getOwningProcessor().bindOutput((Integer) objects[0] + (Integer) objects[1]);
            }
        });

        stream1.bind(1);
        stream1.bind(2);
        stream1.bind(3);
        stream3.bind(100);
        stream3.bind(100);
        stream3.bind(100);
        System.out.println("Result: " + stream4.getVal());
        System.out.println("Result: " + stream4.getVal());
        System.out.println("Result: " + stream4.getVal());
        // stop() was renamed to terminate() in 1.0.0 (see Renaming hints)
        op1.terminate();
        op2.terminate();
    }
}

Performance

In general, GPars overhead is identical irrespective of whether you use it from Groovy or Java and tends to be very low. GPars actors, for example, can compete head-to-head with other JVM actor options, like Scala actors.

Since Groovy code in general runs slower than Java code, mainly due to dynamic method invocation, you might consider writing your code in Java to improve performance. Typically numeric operations or frequent fine-grained method calls within a task or actor body may benefit from a rewrite into Java.

Prerequisites

All the GPars integration rules apply to Java projects just like they do to Groovy projects. You only need to include the Groovy distribution jar file in your project and you are clear to march ahead. You may also want to check out the Sample Java Maven Project for tips on how to integrate GPars into a Maven-based pure Java application.