Spark

Blog Posts tagged Spark

blog post

News from the sparkly-verse

Highlights of the most recent updates to `sparklyr` and friends

Edgar Ruiz

Apr 22, 2024

MLOps and Admin Data Wrangling AI Packages Releases Spark

blog post

sparklyr 1.3: Higher-order Functions, Avro and Custom Serializers

sparklyr 1.3 is now available, featuring integration of Spark higher-order functions and data import/export in Avro and user-defined serialization formats

Yitao Li

Jul 16, 2020

Data Wrangling Big Data Sparklyr Spark Packages RStudio

blog post

sparklyr 1.2: Foreach, Spark 3.0 and Databricks Connect

sparklyr 1.2: foreach parallel backend, Databricks Connect support, and Spark 3.0 compatibility

Yitao Li

May 6, 2020

Data Wrangling MLOps and Admin Big Data Sparklyr Spark Packages RStudio

blog post

sparklyr 1.1: Foundations, Books, Lakes and Barriers

sparklyr 1.1: Delta Lake support, Spark 3.0 preview, and barrier execution for deep learning

Javier Luraschi

Jan 29, 2020

Data Wrangling MLOps and Admin Sparklyr Big Data Spark Packages RStudio

blog post

sparklyr 1.0: Apache Arrow, XGBoost, Broom and TFRecords

sparklyr 1.0: Apache Arrow for faster data transfers, XGBoost models, broom integration, and TFRecords

Javier Luraschi

Mar 15, 2019

Machine Learning Data Wrangling Sparklyr Big Data Spark Arrow Packages RStudio

blog post

sparklyr 0.9: Streams and Kubernetes

sparklyr 0.9: Spark structured streams for real-time data processing and Kubernetes cluster support

Javier Luraschi

Oct 1, 2018

Data Wrangling MLOps and Admin Spark Sparklyr Streaming Big Data Packages RStudio

blog post

sparklyr 0.7: Spark Pipelines and Machine Learning

sparklyr 0.7: ML Pipelines API for building, tuning, and deploying machine learning workflows at scale

Kevin Kuo

Jan 29, 2018

Machine Learning MLOps and Admin Data Science Distributed Computing Spark Sparklyr Packages RStudio

blog post

sparklyr 0.6: Distributed R and external sources

sparklyr 0.6: distributed R with spark_apply() and external data source connections

Javier Luraschi

Jul 31, 2017

Data Wrangling MLOps and Admin Spark Sparklyr Distributed Computing Packages RStudio

blog post

sparklyr 0.5: Livy and dplyr improvements

sparklyr 0.5 extends dplyr with do() and n_distinct(), and adds experimental Livy support for remote Spark connections

Javier Luraschi

Jan 24, 2017

Data Wrangling MLOps and Admin Livy Spark Sparklyr Packages RStudio

blog post

SparkR preview by Vincent Warmerdam

Guest tutorial: Get started with SparkR for distributed computing—install locally, run map/reduce operations, and deploy on AWS

Garrett Grolemund

May 28, 2015

MLOps and Admin Big Data Spark SparkR RStudio Training

Resources tagged Spark

video

Introducing an R interface for Apache Spark | RStudio Webinar - 2017

This is a recording of an RStudio webinar. You can subscribe to receive invitations to future webinars at https://www.rstudio.com/resources/webinars/ . We try to host a couple each month with the goal of furthering the R community's understanding of R and RStudio's capabilities. We are always interested in receiving feedback, so please don't hesitate to comment or reach out with a personal message

Nov 29, 2017

46 min

3.5k views

rstudio webinars RStudio RStudio Webinar R Programming Spark Apache Spark

video

SparklyR

Aug 7, 2017

1 min

1.4k views

RStudio Spark R Programming Sparklyr