Tallan's Technology Blog

Tallan's Top Technologists Share Their Thoughts on Today's Technology Challenges

Adding R Packages In Azure ML

Iu-Wei Sze

The Execute R Script module in Azure Machine Learning is incredibly useful for manipulating data in ways that other modules do not cover. Its functionality can be further expanded by adding R packages that are not included in Azure ML by default. We will first show you how to get a list of packages that are already in your workspace and then how to add additional packages.

Checking Which R Packages are in Your Workspace

Create a new experiment, and place the following R code in an “Execute R Script” module:

# List every package installed in the workspace and send it to the module's output port.
data.set <- data.frame(installed.packages())
maml.mapOutputPort("data.set")

Run the experiment. The output of the Execute R Script module will be a table with one row per available package, including its name and version.
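If you only care about a handful of packages, the full table can be narrowed down. The following is a minimal sketch, assuming a hypothetical list of package names in `wanted`; the `exists()` guard is our own addition so that the same script also runs outside Azure ML, where `maml.mapOutputPort` is not defined.

```r
# Hypothetical list of packages the experiment needs; replace with your own.
wanted <- c("lda", "stats")

# installed.packages() returns a character matrix; keep only the columns of interest.
pkgs <- as.data.frame(
  installed.packages()[, c("Package", "Version"), drop = FALSE],
  stringsAsFactors = FALSE
)

# One row per wanted package, with its version or NA when it is missing.
data.set <- data.frame(
  Package = wanted,
  Installed = wanted %in% pkgs$Package,
  Version = pkgs$Version[match(wanted, pkgs$Package)],
  stringsAsFactors = FALSE
)

# maml.mapOutputPort only exists inside Azure ML, so guard the call for local runs.
if (exists("maml.mapOutputPort")) {
  maml.mapOutputPort("data.set")
} else {
  print(data.set)
}
```

Any package reported as not installed is a candidate for the upload procedure described in the next section.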

Adding R Packages

Before you can use a package in Azure ML, you need to set up the zip file structure in which Azure ML expects packages to appear. To do this, start by installing the packages that you need locally.

From the R command line, use

install.packages("PackageName")

or use any other technique you normally use to install packages. Afterwards, locate each package's zip file. If you install the package using the technique suggested above, the download path is printed during installation. It should look something like “C:\Users\{username}\AppData\Local\Temp\Rtmp{random letters}\downloaded_packages”.
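If the printed path has scrolled out of view, the downloaded files can also be listed from within R. This is a small sketch that assumes you are still in the same R session that ran install.packages and that you left its `destdir` argument at the default, in which case downloads go to a `downloaded_packages` folder under the session's temporary directory.

```r
# install.packages() stores downloads here when destdir is left at its default.
download.dir <- file.path(tempdir(), "downloaded_packages")

# Full paths of every zip file downloaded in this session (empty if there are none).
zip.files <- list.files(download.dir, pattern = "\\.zip$", full.names = TRUE)
print(zip.files)
```

The temporary directory is removed when the R session ends, so copy the zip files somewhere permanent before closing R.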

Once you have found the packages' zip files, put them into another zip file. In our case, we put lda_1.4.2.zip inside lda_packages.zip.
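The outer zip file can be created with any archive tool. If you prefer to stay in R, the sketch below shows one way to do it. `bundle_packages` is a hypothetical helper name of our own, and it relies on `utils::zip`, which in turn calls an external zip program that must be available on your system (it is not always present on Windows).

```r
# Hypothetical helper: place the package zip files at the root of one outer zip.
bundle_packages <- function(zip.files, outer.zip) {
  stopifnot(length(zip.files) > 0, all(file.exists(zip.files)))
  # "-j" stores each file without its directory path, so it sits at the archive root.
  status <- utils::zip(zipfile = outer.zip, files = zip.files, flags = "-j9X")
  # utils::zip returns the exit status of the external program; 0 means success.
  invisible(status == 0)
}

# Example call using the file names from this post:
# bundle_packages("lda_1.4.2.zip", "lda_packages.zip")
```

Keeping the inner zip files at the root of the archive matters because the install line later refers to them directly under src/.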

Now that the packages are in the structure that Azure ML expects, upload the outer zip file to your desired workspace as a dataset. Once the zip file has finished uploading, add these lines to the Execute R Script module to install and load the packages:

install.packages("src/{package's zip file name}.zip", lib = ".", repos = NULL, verbose = TRUE)
library({package name}, lib.loc=".", verbose=TRUE)

In our case, the first line would be install.packages("src/lda_1.4.2.zip", lib = ".", repos = NULL, verbose = TRUE) and the second line would be library(lda, lib.loc = ".", verbose = TRUE).
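When the outer zip file holds several packages, writing one pair of lines per package gets tedious. The following is a rough sketch of a loop that does the same thing for every zip file under src/. `package_name_from_zip` is a hypothetical helper of our own, and it assumes the files keep the usual {package name}_{version}.zip naming.

```r
# Hypothetical helper: "src/lda_1.4.2.zip" -> "lda".
# Package names cannot contain underscores, so everything before the first one is the name.
package_name_from_zip <- function(zip.path) {
  sub("_.*$", "", basename(zip.path))
}

# Every package zip that Azure ML unpacked from the uploaded bundle.
zips <- list.files("src", pattern = "\\.zip$", full.names = TRUE)

for (z in zips) {
  install.packages(z, lib = ".", repos = NULL, verbose = TRUE)
  # character.only = TRUE lets library() accept the name as a string.
  library(package_name_from_zip(z), lib.loc = ".", character.only = TRUE, verbose = TRUE)
}
```

list.files returns the files in alphabetical order, which may not match the order the packages depend on each other. If one package needs another, list the files explicitly so that dependencies are installed first.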

The zipped packages should now appear as a module under Saved Datasets > My Datasets. Drag this module into the experiment and connect its output to the third input (the Script Bundle port) of the Execute R Script module that will be using the packages.

From there, you should be able to use the package as you normally would in R.

Note: In one instance, the package was not available in the workspace immediately, and we had to wait about half an hour before we could use it. You may be running into the same issue if the module reports an error about the package even though you have used the above method to check which packages are in your workspace and the package in question appears on that list. If this is the case, we suggest waiting a bit before running your experiment again.
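To see what is going wrong without the whole module failing, the load can be wrapped in tryCatch and the outcome sent to the output port. This is only a diagnostic sketch of our own, using the lda package from this post as the example. The guard around `maml.mapOutputPort` is there so the script also runs outside Azure ML.

```r
# Package from this post, used as the example; replace with your own.
pkg <- "lda"

# Try to load the package from the local library and capture any error text.
load.result <- tryCatch({
  library(pkg, lib.loc = ".", character.only = TRUE)
  "loaded"
}, error = function(e) conditionMessage(e))

# Report the outcome together with whatever files are visible under src/.
data.set <- data.frame(
  Package = pkg,
  Result = load.result,
  SrcFiles = paste(list.files("src"), collapse = ", "),
  stringsAsFactors = FALSE
)

if (exists("maml.mapOutputPort")) {
  maml.mapOutputPort("data.set")
} else {
  print(data.set)
}
```

An empty SrcFiles column suggests the zipped dataset is not connected to the module, while a populated one points at the install or load step instead.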


Authored By:

Lu Li & Iu-Wei Sze
