2 Clojisr tutorial
2.1 Setup
(ns clojisr.v1.tutorials.main
  (:require [clojisr.v1.r :as r :refer [r eval-r->java r->java java->r java->clj java->native-clj clj->java r->clj clj->r ->code r+ colon require-r]]
            [clojisr.v1.robject :as robject]
            [clojisr.v1.session :as session]
            [tech.v3.dataset :as dataset]
            [scicloj.kindly.v4.kind :as kind]
            [scicloj.kindly.v4.api :as kindly]))
2.2 Basic examples
Let us start with some basic usage examples of Clojisr.
First, let us make sure that we use the Rserve backend (in case we were using another engine instead earlier), and that there are no R sessions currently running. This is typically not needed if you just started working. Here, we do it just in case.
(r/set-default-session-type! :rserve)
{:session-type :rserve}
(r/discard-all-sessions)
{}
Now let us run some R code, and keep a Clojure handle to the return value.
(def x (r "1+2"))
The thing we created is something called an ROBject.
(class x)
clojisr.v1.robject.RObject
If we wish, we can convert an ROBject to Clojure:
(r->clj x)
[3.0]
Let us see more examples of creating ROBjects and converting them to Clojure:
(->> "list(A=1,B=2,'#123strange<text> ()'=3)"
     r
     r->clj)
{:A [1.0], :B [2.0], "#123strange<text> ()" [3.0]}
In the other direction, we can convert Clojure data to R data. Note that nil is turned to NA.
(-> [1 nil 3]
    clj->r)
[1]  1 NA  3
We can run code on a separate R session by specifying session-args that differ from the default ones.
(-> "1+2"
    (r :session-args {:session-name "mysession"})
    r->clj)
[3.0]
2.3 Functions
An R function is also a Clojure function.
(def f (r "function(x) x*10"))
Let us apply it to Clojure data (implicitly converting that data to R).
(-> 5
    f
    r->clj)
[50.0]
We can also apply it to R data.
(-> "5*5"
    r
    f
    r->clj)
[250.0]
Functions can take named arguments. Here we pass the na.rm argument, which tells R whether to remove missing values when computing the mean.
(r->clj ((r "mean")
         [1 nil 3]
         :na.rm true))
[2.0]
Another example:
(let [f (r "function(w,x,y=10,z=20) w+x+y+z")]
  (->> [(f 1 2)
        (f 1 2 :y 100)
        (f 1 2 :z 100)]
       (map r->clj)))
([33.0] [123.0] [113.0])
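To make the calling convention above concrete, here is a minimal sketch in plain Clojure (no R session needed) of how positional arguments and keyword/value pairs could be rendered as an R call string. It is a toy model for illustration only, not clojisr's implementation; the names value->r and call->r-string are hypothetical.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: render a few Clojure values as R literals.
(defn value->r [v]
  (cond
    (nil? v)        "NA"      ; nil becomes NA, as described earlier
    (true? v)       "TRUE"
    (false? v)      "FALSE"
    (sequential? v) (str "c(" (str/join "," (map value->r v)) ")")
    :else           (str v)))

;; Hypothetical helper: a keyword followed by a value is treated as a
;; named argument; everything else is positional.
(defn call->r-string [fname args]
  (loop [remaining args
         parts     []]
    (if (empty? remaining)
      (str fname "(" (str/join "," parts) ")")
      (let [[a b & more] remaining]
        (if (keyword? a)
          (recur more (conj parts (str (name a) "=" (value->r b))))
          (recur (rest remaining) (conj parts (value->r a))))))))

(call->r-string "f" [1 2 :y 100])
;; => "f(1,2,y=100)"
(call->r-string "mean" [[1 nil 3] :na.rm true])
;; => "mean(c(1,NA,3),na.rm=TRUE)"
```

Under this reading, (f 1 2 :y 100) corresponds to the R call f(1, 2, y=100), which is consistent with the result 123 shown above.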
Some functions are already created in Clojisr and given special names for convenience. Here are some examples:
R addition:
(->> (r+ 1 2 3)
     r->clj)
[6]
R colon (:), for creating a range of integers, like 0:9:
(r->clj (colon 0 9))
[0 1 2 3 4 5 6 7 8 9]
2.4 R dataframes and tech.ml.dataset datasets
In Clojure, we have a structure that is equivalent to R dataframes: a tech.ml.dataset dataset.
Let us create such a dataset, pass it to an R function to compute the row means, and then convert the return value back to Clojure.
(let [row-means (r "function(data) rowMeans(data)")]
  (-> {:x [1 2 3]
       :y [4 5 6]}
      dataset/->dataset
      row-means
      r->clj))
[2.5 3.5 4.5]
Let us see some more dataset processing through R.
Loading the R package dplyr (assuming it is installed).
(r "library(dplyr)")
[1] "dplyr"     "Rserve"    "stats"     "graphics"  "grDevices" "utils"
[7] "datasets"  "methods"   "base"
Using dplyr to process a Clojure dataset, and converting the result back to a Clojure dataset.
(let [filter-by-x  (r "function(data) filter(data, x>=2)")
      add-z-column (r "function(data) mutate(data, z=x+y)")]
  (-> {:x [1 2 3]
       :y [4 5 6]}
      dataset/->dataset
      filter-by-x
      add-z-column
      r->clj))
_unnamed [2 3]:
| :x | :y | :z |
|----|----|----|
|  2 |  5 |  7 |
|  3 |  6 |  9 |
Tibbles, which are a more recent R dataframe notion, are also supported, as a special case of data frames.
(r "library(tibble)")
[1] "tibble"    "dplyr"     "Rserve"    "stats"     "graphics"  "grDevices"
[7] "utils"     "datasets"  "methods"   "base"
(let [tibble (r "tibble")]
  (tibble
   :x [1 2 3]
   :y [4 5 6]))
# A tibble: 3 × 2
      x     y
  <int> <int>
1     1     4
2     2     5
3     3     6
(let [tibble (r "tibble")]
  (-> (tibble
       :x [1 2 3]
       :y [4 5 6])
      r->clj
      dataset/mapseq-reader))
[{:x 1, :y 4} {:x 2, :y 5} {:x 3, :y 6}]
2.5 R objects
Clojisr holds handles to R objects, which are stored in memory in the R session, where they are assigned random names.
(def one+two (r "1+2"))
(class one+two)
clojisr.v1.robject.RObject
The name of an object is the place where it is held in R (inside an R environment called .MEM).
(:object-name one+two)
".MEM$x7a9289ee40e04192"
2.6 Generating code
Let us see the mechanism by which clojisr generates R code, and the rules defining it.
Since we are playing a bit with the internals here, we will need a reference to the R session:
(def session
  (session/fetch-or-make nil))
For the following examples, we will use some dummy handles to R objects with given names:
(def x (robject/->RObject "robject_x" session nil nil))
(def y (robject/->RObject "robject_y" session nil nil))
... and some real handles to R objects:
(def minus-eleven (r "-11"))
(def abs (r "abs"))
The function ->code generates R code according to a certain set of rules. Here we describe some of these rules briefly. A dedicated tutorial covers the rule set more thoroughly.
For an ROBject, the generated code is just the ROBject name.
(->code x)
"robject_x"
For a Clojure value, we analyze the form and generate the corresponding R string or values.
(->code "hello")
"\"hello\""
(->code [1 2 3])
"c(1L,2L,3L)"
For a symbol, we generate the code with the corresponding R symbol.
(->code 'x)
"x"
A sequential structure (list, vector, etc.) can be interpreted as a compound expression, for which code generation is defined according to the first list element.
For a list beginning with the symbol 'function, we generate an R function definition.
(->code '(function [x y] x))
"function(x,y) {x}"
For a vector instead of a list, we create an R vector.
(->code '[function [x y] x])
"c(function,c(x,y),x)"
For a list beginning with the symbol 'formula, we generate an R ~-formula.
(->code '(formula x y))
"(x~y)"
For a list beginning with a symbol known to be a binary operator, we generate nested calls.
(->code '(+ x y z))
"((x+y)+z)"
For a list beginning with another symbol, we generate a function call with that symbol as the function name.
(->code '(f x))
"f(x)"
For a list beginning with an R object that is a function, we generate a function call with that object as the function. If you create the list using the quote sign ('), don’t forget to unquote symbols referring to things you defined on the Clojure side.
(->code '(~abs x))
".MEM$x122c42652d2c44ea(x)"
All other sequential things (that is, those not beginning with a symbol or R function) are interpreted as data and converted implicitly to the R data representation.
(->code `(~abs (1 2 3)))
".MEM$x122c42652d2c44ea(c(1L,2L,3L))"
Some more examples, showing how these rules compose:
(->code '(function [x y] (f y)))
"function(x,y) {f(y)}"
(->code '(function [x y] (f ~y)))
"function(x,y) {f(robject_y)}"
(->code '(function [x y] (+ x y)))
"function(x,y) {(x+y)}"
(->code (list 'function '[x y] (list '+ 'x 'y)))
"function(x,y) {(x+y)}"
(->code '(function [x y] (print x) (f x)))
"function(x,y) {print(x);f(x)}"
(->code '(function [x y] (~abs x)))
"function(x,y) {.MEM$x122c42652d2c44ea(x)}"
(->code '(~abs ~minus-eleven))
".MEM$x122c42652d2c44ea(.MEM$x28987055e359432e)"
(->code '(~abs -11))
".MEM$x122c42652d2c44ea(-11L)"
Use syntax quote ` in case you want to use local bindings.
(let [minus-ten -10]
  (->code `(~abs ~minus-ten)))
".MEM$x122c42652d2c44ea(-10L)"
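The rules described in this section can be summarized with a small toy model in plain Clojure. This is only a sketch written to reproduce the outputs shown above; it is not clojisr's ->code implementation, it ignores unquoting and RObject handles, and the names toy->code and binary-operators (including the contents of that set) are assumptions made for illustration.

```clojure
(require '[clojure.string :as str])

;; Hypothetical operator table; the real set of binary operators may differ.
(def binary-operators '#{+ - * / >= <= > < ==})

(declare toy->code)

(defn- args->code [args]
  (str/join "," (map toy->code args)))

(defn- seq->code [[head & more :as form]]
  (cond
    ;; (function [params] body...) -> R function definition
    (= head 'function)
    (let [[params & body] more]
      (str "function(" (str/join "," (map name params)) ") {"
           (str/join ";" (map toy->code body)) "}"))

    ;; (formula lhs rhs) -> R formula
    (= head 'formula)
    (let [[lhs rhs] more]
      (str "(" (toy->code lhs) "~" (toy->code rhs) ")"))

    ;; binary operator with at least one operand -> left-nested calls
    (contains? binary-operators head)
    (reduce (fn [acc operand] (str "(" acc head operand ")"))
            (map toy->code more))

    ;; any other symbol -> function call by name
    (symbol? head)
    (str head "(" (args->code more) ")")

    ;; everything else is treated as data
    :else
    (str "c(" (args->code form) ")")))

(defn toy->code [form]
  (cond
    (string? form)  (pr-str form)
    (integer? form) (str form "L")
    (symbol? form)  (name form)
    (vector? form)  (str "c(" (args->code form) ")")
    (seq? form)     (seq->code form)
    :else           (throw (ex-info "unsupported form in toy model" {:form form}))))

(toy->code '(+ x y z))
;; => "((x+y)+z)"
(toy->code '(function [x y] (print x) (f x)))
;; => "function(x,y) {print(x);f(x)}"
```

The dispatch on the first element of a list is the key design point: the same sequence of symbols yields a function definition as a list but a plain R vector as a Clojure vector.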
2.7 Running generated code
Clojure forms can be run as R code. Behind the scenes, they are turned to R code using the ->code function described above. For example:
(-> '(~abs ~(range -3 0))
    r
    r->clj)
[3 2 1]
Or, equivalently:
(-> '(~abs ~(range -3 0))
    ->code
    r
    r->clj)
[3 2 1]
Let us repeat the basic examples from the beginning of this tutorial, this time generating code rather than writing it as Strings.
(def x (r '(+ 1 2)))
(r->clj x)
[3]
(def f (r '(function [x] (* x 10))))
(-> 5
    f
    r->clj)
[50]
(-> "5*5"
    r
    f
    r->clj)
[250.0]
(let [row-means (r '(function [data] (rowMeans data)))]
  (-> {:x [1 2 3]
       :y [4 5 6]}
      dataset/->dataset
      row-means
      r->clj))
[2.5 3.5 4.5]
(r '(library dplyr))
[1] "tibble"    "dplyr"     "Rserve"    "stats"     "graphics"  "grDevices"
[7] "utils"     "datasets"  "methods"   "base"
(let [filter-by-x  (r '(function [data] (filter data (>= x 2))))
      add-z-column (r '(function [data] (mutate data (= z (+ x y)))))]
  (->> {:x [1 2 3]
        :y [4 5 6]}
       dataset/->dataset
       filter-by-x
       add-z-column
       r->clj))
_unnamed [2 3]:
| :x | :y | :z |
|----|----|----|
|  2 |  5 |  7 |
|  3 |  6 |  9 |
2.8 Requiring R packages
Sometimes, we want to bring functions and data from R packages into the Clojure world. Here, we try to follow the require-python syntax of libpython-clj (though currently in a less sophisticated way).
(require-r '[stats :as statz :refer [median]])
nil
(-> [1 2 3]
    r.stats/median
    r->clj)
[2]
(-> [1 2 3]
    statz/median
    r->clj)
[2]
(-> [1 2 3]
    median
    r->clj)
[2]
(require-r '[datasets :as datasetz :refer [euro]])
nil
[r.datasets/euro
 datasetz/euro
 euro]
[        ATS         BEF         DEM         ESP         FIM         FRF
   13.760300   40.339900    1.955830  166.386000    5.945730    6.559570
         IEP         ITL         LUF         NLG         PTE
    0.787564 1936.270000   40.339900    2.203710  200.482000
         ATS         BEF         DEM         ESP         FIM         FRF
   13.760300   40.339900    1.955830  166.386000    5.945730    6.559570
         IEP         ITL         LUF         NLG         PTE
    0.787564 1936.270000   40.339900    2.203710  200.482000
         ATS         BEF         DEM         ESP         FIM         FRF
   13.760300   40.339900    1.955830  166.386000    5.945730    6.559570
         IEP         ITL         LUF         NLG         PTE
    0.787564 1936.270000   40.339900    2.203710  200.482000
]
(require-r '[base :refer [$]])
nil
(-> {:a 1 :b 2}
    ($ 'a)
    r->clj)
[1]
2.9 Data visualization
Functions that create R plots, as well as plot objects generated by various R libraries, can be wrapped so that the result is returned as an SVG or a BufferedImage, or saved to a file. All of these wrappers accept additional parameters specified in the grDevices R package.
Currently there is a bug that sometimes causes axes and labels to disappear when rendered inside a larger HTML document.
(require-r '[graphics :refer [plot hist]])
nil
(require-r '[ggplot2 :refer [ggplot aes geom_point xlab ylab labs]])
nil
(require '[clojisr.v1.applications.plotting :refer [plot->svg plot->file plot->buffered-image]])
First example: a simple plotting function rendered as an SVG string.
(plot->svg
 (fn []
   (->> rand
        (repeatedly 30)
        (reductions +)
        (plot :xlab "t"
              :ylab "y"
              :type "l"))))
ggplot2 plots (or any other plot objects, like lattice) can also be turned into SVG.
(plot->svg
 (let [x (repeatedly 99 rand)
       y (map +
              x
              (repeatedly 99 rand))]
   (-> {:x x :y y}
       dataset/->dataset
       (ggplot (aes :x x
                    :y y
                    :color '(+ x y)
                    :size '(/ x y)))
       (r+ (geom_point)
           (xlab "x")
           (ylab "y")))))
Any plot (function or object) can be saved to a file or converted to a BufferedImage object.
(let [path "/tmp/histogram.jpg"]
  (r->clj (plot->file path
                      (fn [] (hist [1 1 1 1 2 3 4 5]
                                   :main "Histogram"
                                   :xlab "data: [1 1 1 1 2 3 4 5]"))
                      :width 800 :height 400 :quality 50))
  (-> (clojure.java.shell/sh "ls" path)
      :out
      kind/code))
/tmp/histogram.jpg
(plot->buffered-image (fn [] (hist [1 1 1 1 2 3 4 5])) :width 222 :height 149)
2.10 Intermediary representation as Java objects
Clojisr relies on an intermediary representation of R data as Java objects. This is usually hidden from the user, but may be useful sometimes. In the current implementation, this is based on REngine.
(import (org.rosuda.REngine REXP REXPInteger REXPDouble))
org.rosuda.REngine.REXPDouble
We can convert data between R and Java.
(-> "1:9"
    r
    r->java
    class)
org.rosuda.REngine.REXPInteger
(-> (REXPInteger. 1)
    java->r
    r->clj)
[1]
We can further convert data from the Java representation to Clojure.
(-> "1:9"
    r
    r->java
    java->clj)
[1 2 3 4 5 6 7 8 9]
In the opposite direction, we can also convert Clojure data into the Java representation.
(-> (range 1 10)
    clj->java
    class)
org.rosuda.REngine.REXPInteger
(-> (range 1 10)
    clj->java
    java->clj)
[1 2 3 4 5 6 7 8 9]
There is an alternative way of conversion from Java to Clojure, naively converting the internal Java representation to a Clojure data structure. It can be handy when one wants to have plain access to all the metadata (R attributes), etc.
(->> "1:9"
     r
     r->java
     java->native-clj)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
(->> "data.frame(x=1:3,y=factor('a','a','b'))"
     r
     r->java
     java->native-clj)
{:x [1, 2, 3], :y [1, 1, 1]}
We can evaluate R code and immediately return the result as a Java object, without ever creating a handle to an R object holding the result:
(-> "1+2"
    eval-r->java
    class)
org.rosuda.REngine.REXPDouble
(-> "1+2"
    eval-r->java
    (.asDoubles)
    vec)
[3.0]
2.11 More data conversion examples
Conversion between R and Clojure always passes through Java. To stress this, we write it explicitly in the following examples.
(-> "list(a=1:2,b='hi!')"
    r
    r->java
    java->clj)
{:a [1 2], :b ["hi!"]}
Partially named lists are also supported:
(-> "list(a=1:2,'hi!')"
    r
    r->java
    java->clj)
{:a [1 2], 1 ["hi!"]}
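Judging only from the output above, an unnamed list element appears to be keyed by its zero-based position. A minimal plain-Clojure sketch of that keying rule follows; the helper named-list->map is hypothetical and is not part of clojisr.

```clojure
;; Hypothetical helper: pair R list names with values, falling back to the
;; element's zero-based position when the name is empty.
(defn named-list->map [names values]
  (into {}
        (map-indexed (fn [i [nm v]]
                       [(if (seq nm) (keyword nm) i) v])
                     (map vector names values))))

(named-list->map ["a" ""] [[1 2] ["hi!"]])
;; => {:a [1 2], 1 ["hi!"]}
```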
(-> "table(c('a','b','a','b','a','b','a','b'), c(1,1,2,2,3,3,1,1))"
    r
    r->java
    java->clj
    dataset/mapseq-reader
    set)
#{{0 "a", 1 "2", :$value 2}
  {0 "b", 1 "3", :$value 1}
  {0 "a", 1 "1", :$value 2}
  {0 "a", 1 "3", :$value 1}
  {0 "b", 1 "2", :$value 1}
  {0 "b", 1 "1", :$value 1}}
(-> {:a [1 2] :b "hi!"}
    clj->java
    java->r
    r->java
    java->clj)
{:a [1 2], :b ["hi!"]}
(->> {:a [1 2] :b "hi!"}
     clj->java
     java->r
     ((r "deparse"))
     r->java
     java->clj)
["list(a = 1:2, b = \"hi!\")"]
2.11.1 Basic types conversion clj->r->clj
(def clj->r->clj (comp r->clj r))
(clj->r->clj nil)
nil
(clj->r->clj [10 11])
[10 11]
(clj->r->clj [10.0 11.0])
[10.0 11.0]
(clj->r->clj (list 10.0 11.0))
[10.0 11.0]
(clj->r->clj {:a 1 :b 2})
{:a [1], :b [2]}
2.11.2 Various R objects
Named list
(-> "list(a=1L,b=c(10,20),c='hi!')"
    r
    r->clj)
{:a [1], :b [10.0 20.0], :c ["hi!"]}
Array of doubles
(-> "c(10,20,30)"
    r
    r->clj)
[10.0 20.0 30.0]
Array of longs
(-> "c(10L,20L,30L)"
    r
    r->clj)
[10 20 30]
Timeseries
(-> 'euro
    r
    r->clj
    first)
13.7603
Pairlist
(-> r.stats/dnorm
    r.base/formals
    r->clj
    keys
    sort)
(:log :mean :sd :x)
NULL
(-> "NULL"
    r
    r->clj)
nil
TRUE/FALSE
(-> "TRUE"
    r
    r->clj)
[true]
2.12 Inspecting R functions
The mean function is defined to expect two arguments: x and the ellipsis (...). These arguments have no default values (thus, its formals have empty symbols as values):
(-> 'mean
    r.base/formals
    r->clj)
{:x , :... }
It is an S3 generic function, which we can see by printing it:
(r 'mean)
function (x, ...)
UseMethod("mean")
<bytecode: 0x647b13f877a0>
<environment: namespace:base>
So, we can expect possibly more details when inspecting its default implementation. Now, we see some arguments that do have default values.
(-> 'mean.default
    r.base/formals
    r->clj)
{:x , :trim [0.0], :na.rm [false], :... }
2.13 R-function-arglists
As we saw earlier, R functions are Clojure functions. The arglists of functions brought up by require-r match the expected arguments. Here are some examples:
(require-r
 '[base]
 '[stats]
 '[grDevices])
nil
(->> [#'r.base/mean, #'r.base/mean-default, #'r.stats/arima0,
      #'r.grDevices/dev-off, #'r.base/Sys-info, #'r.base/summary-default
      ;; Primitive functions:
      #'r.base/sin, #'r.base/sum]
     (map (fn [f]
            (-> f
                meta
                (update :ns (comp symbol str))))))
({:arglists ([x & {:keys [...]}]), :name mean, :ns r.base}
 {:arglists ([x & {:keys [trim na.rm ...]}]),
  :name mean-default,
  :ns r.base}
 {:arglists
  ([x
    &
    {:keys
     [order
      seasonal
      xreg
      include.mean
      delta
      transform.pars
      fixed
      init
      method
      n.cond
      optim.control]}]),
  :name arima0,
  :ns r.stats}
 {:arglists ([& {:keys [which]}]), :name dev-off, :ns r.grDevices}
 {:arglists ([]), :name Sys-info, :ns r.base}
 {:arglists ([object & {:keys [... digits quantile.type]}]),
  :name summary-default,
  :ns r.base}
 {:arglists ([x]), :name sin, :ns r.base}
 {:arglists ([& {:keys [... na.rm]}]), :name sum, :ns r.base})