parallel processing on more than one machine

Russ P.
I am doing parallel processing using the parallel Vector class on an HP Linux machine with 32 processor cores. It works well for my current purposes, but eventually my code has to run in real time, and it isn't fast enough yet.

I'm wondering if there is a way to parallelize over more than one machine without any major changes to my current Scala code. I have never used Akka or Spark, and I know very little about them. Can they somehow be used for that? Thanks.

--Russ

--
You received this message because you are subscribed to the Google Groups "scala-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

Re: parallel processing on more than one machine

Oliver Ruebenacker

     Hello,

  If you want to use multiple machines, it means that, at least under the hood, you have multiple apps running on multiple JVMs. I'm afraid that means major changes to your code.

  The easiest option is probably Spark. In Spark, you have a driver (master) app that holds objects called Resilient Distributed Datasets (RDDs), whose API is very similar to that of the standard Scala collections (map, fold, and so on). Under the hood, the data is distributed over multiple worker nodes, and as you apply operations like map or fold, it is processed on the workers and sent around as needed.
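
  To make the similarity concrete, here is a minimal sketch of a map-then-fold over an RDD. It assumes the Spark core library is on the classpath and runs in local mode; `RddSketch`, `cost`, and `totalCost` are hypothetical names chosen for illustration, and `cost` is only a stand-in for the real per-element work. With a parallel Vector, the same computation would read roughly `inputs.par.map(cost).fold(0.0)(_ + _)`.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddSketch {
  // Hypothetical stand-in for the real per-element computation.
  def cost(x: Int): Double = math.sqrt(x.toDouble)

  // Sum of cost over all inputs, computed through an RDD.
  def totalCost(sc: SparkContext, inputs: Seq[Int]): Double =
    sc.parallelize(inputs)  // distribute the local collection as an RDD
      .map(cost)            // applied to each element on the workers
      .fold(0.0)(_ + _)     // partial sums are combined and returned to the driver

  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark inside this JVM using all available cores (assumption:
    // a single-machine trial run, not a cluster deployment).
    val conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val inputs = (1 to 1000000).toVector
      println(totalCost(sc, inputs))
    } finally {
      sc.stop()
    }
  }
}
```

  The collection-style calls look familiar, but the functions passed to `map` and `fold` are shipped to other JVMs, so anything they capture has to be serializable. That is one of the places where existing parallel-collections code may need changes.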

     Best, Oliver

On Thu, Mar 9, 2017 at 8:03 PM, Russ P. <[hidden email]> wrote:
> I am doing parallel processing using the parallel Vector class on an HP Linux machine with 32 processor cores. It works well for my current purposes, but eventually my code has to run in real time, and it isn't fast enough yet.
>
> I'm wondering if there is a way to parallelize over more than one machine without any major changes to my current Scala code. I have never used Akka or Spark, and I know very little about them. Can they somehow be used for that? Thanks.
>
> --Russ

--
Oliver Ruebenacker
Senior Software Engineer, Diabetes Portal, Broad Institute

Re: parallel processing on more than one machine

Peter Wolf
+1 Spark

I have used both Spark and Akka actors for this sort of task. If your problem is a simple operation on a giant Vector, Spark will make it very simple to scale up. Try Spark in local mode on your laptop to write and debug your code, then maybe rent a cluster from Amazon or an equivalent provider to run it.
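
One way to keep the same code for both the laptop and the cluster is to avoid hard-coding the master. The sketch below assumes the Spark core library is on the classpath; `MasterSketch` and `buildConf` are hypothetical names. It falls back to local mode only when no master has been supplied from outside.

```scala
import org.apache.spark.SparkConf

object MasterSketch {
  // Defaults to local mode, but leaves an externally supplied
  // spark.master (for example from spark-submit --master) untouched.
  def buildConf(appName: String): SparkConf =
    new SparkConf()                              // picks up spark.* system properties
      .setAppName(appName)
      .setIfMissing("spark.master", "local[*]")  // only applies when no master is set
}
```

Run directly from an IDE or sbt, this would use all local cores; launched through `spark-submit` with a `--master` option, the cluster setting would take precedence.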

Peter

Re: parallel processing on more than one machine

Russ P.
Thanks for the helpful replies. I will start with Spark on one machine and go from there.

--Russ

On Friday, March 10, 2017 at 9:02:51 AM UTC-8, Peter Wolf wrote:
> +1 Spark
>
> I have used both Spark and Akka actors for this sort of task. If your problem is a simple operation on a giant Vector, Spark will make it very simple to scale up. Try Spark in local mode on your laptop to write and debug your code, then maybe rent a cluster from Amazon or an equivalent provider to run it.
>
> Peter
