4  Dataset transfer from R to Clojure

(ns clojisr.v1.tutorials.dataset
  (:require [clojisr.v1.r :as r :refer [r r->clj clj->r require-r]]
            [scicloj.kindly.v4.kind :as kind]
            [scicloj.kindly.v4.api :as kindly]))
(r/set-default-session-type! :rserve)
{:session-type :rserve}
(r/discard-all-sessions)
{}
(require-r '[datasets])
nil

4.1 Data Frame

Any data.frame, including tibble and data.table, is treated the same way. If row.names are available, they are converted to an additional column, :$row.names.

r.datasets/BOD
  Time demand
1    1    8.3
2    2   10.3
3    3   19.0
4    4   16.0
5    5   15.6
6    7   19.8
(r->clj '(attributes BOD))
{:names ["Time" "demand"],
 :class ["data.frame"],
 :row.names [1 2 3 4 5 6],
 :reference ["A1.4, p. 270"]}
(r->clj r.datasets/BOD)

_unnamed [6 2]:

:Time :demand
1.0 8.3
2.0 10.3
3.0 19.0
4.0 16.0
5.0 15.6
7.0 19.8
r.datasets/CO2
   Plant        Type  Treatment conc uptake
1    Qn1      Quebec nonchilled   95   16.0
2    Qn1      Quebec nonchilled  175   30.4
3    Qn1      Quebec nonchilled  250   34.8
4    Qn1      Quebec nonchilled  350   37.2
5    Qn1      Quebec nonchilled  500   35.3
6    Qn1      Quebec nonchilled  675   39.2
7    Qn1      Quebec nonchilled 1000   39.7
8    Qn2      Quebec nonchilled   95   13.6
9    Qn2      Quebec nonchilled  175   27.3
10   Qn2      Quebec nonchilled  250   37.1
11   Qn2      Quebec nonchilled  350   41.8
12   Qn2      Quebec nonchilled  500   40.6
13   Qn2      Quebec nonchilled  675   41.4
14   Qn2      Quebec nonchilled 1000   44.3
15   Qn3      Quebec nonchilled   95   16.2
16   Qn3      Quebec nonchilled  175   32.4
17   Qn3      Quebec nonchilled  250   40.3
18   Qn3      Quebec nonchilled  350   42.1
19   Qn3      Quebec nonchilled  500   42.9
20   Qn3      Quebec nonchilled  675   43.9
21   Qn3      Quebec nonchilled 1000   45.5
22   Qc1      Quebec    chilled   95   14.2
23   Qc1      Quebec    chilled  175   24.1
24   Qc1      Quebec    chilled  250   30.3
25   Qc1      Quebec    chilled  350   34.6
26   Qc1      Quebec    chilled  500   32.5
27   Qc1      Quebec    chilled  675   35.4
28   Qc1      Quebec    chilled 1000   38.7
29   Qc2      Quebec    chilled   95    9.3
30   Qc2      Quebec    chilled  175   27.3
31   Qc2      Quebec    chilled  250   35.0
32   Qc2      Quebec    chilled  350   38.8
33   Qc2      Quebec    chilled  500   38.6
34   Qc2      Quebec    chilled  675   37.5
35   Qc2      Quebec    chilled 1000   42.4
36   Qc3      Quebec    chilled   95   15.1
37   Qc3      Quebec    chilled  175   21.0
38   Qc3      Quebec    chilled  250   38.1
39   Qc3      Quebec    chilled  350   34.0
40   Qc3      Quebec    chilled  500   38.9
41   Qc3      Quebec    chilled  675   39.6
42   Qc3      Quebec    chilled 1000   41.4
43   Mn1 Mississippi nonchilled   95   10.6
44   Mn1 Mississippi nonchilled  175   19.2
45   Mn1 Mississippi nonchilled  250   26.2
46   Mn1 Mississippi nonchilled  350   30.0
47   Mn1 Mississippi nonchilled  500   30.9
48   Mn1 Mississippi nonchilled  675   32.4
49   Mn1 Mississippi nonchilled 1000   35.5
50   Mn2 Mississippi nonchilled   95   12.0
51   Mn2 Mississippi nonchilled  175   22.0
52   Mn2 Mississippi nonchilled  250   30.6
53   Mn2 Mississippi nonchilled  350   31.8
54   Mn2 Mississippi nonchilled  500   32.4
55   Mn2 Mississippi nonchilled  675   31.1
56   Mn2 Mississippi nonchilled 1000   31.5
57   Mn3 Mississippi nonchilled   95   11.3
58   Mn3 Mississippi nonchilled  175   19.4
59   Mn3 Mississippi nonchilled  250   25.8
60   Mn3 Mississippi nonchilled  350   27.9
61   Mn3 Mississippi nonchilled  500   28.5
62   Mn3 Mississippi nonchilled  675   28.1
63   Mn3 Mississippi nonchilled 1000   27.8
64   Mc1 Mississippi    chilled   95   10.5
65   Mc1 Mississippi    chilled  175   14.9
66   Mc1 Mississippi    chilled  250   18.1
67   Mc1 Mississippi    chilled  350   18.9
68   Mc1 Mississippi    chilled  500   19.5
69   Mc1 Mississippi    chilled  675   22.2
70   Mc1 Mississippi    chilled 1000   21.9
71   Mc2 Mississippi    chilled   95    7.7
72   Mc2 Mississippi    chilled  175   11.4
73   Mc2 Mississippi    chilled  250   12.3
74   Mc2 Mississippi    chilled  350   13.0
75   Mc2 Mississippi    chilled  500   12.5
76   Mc2 Mississippi    chilled  675   13.7
77   Mc2 Mississippi    chilled 1000   14.4
78   Mc3 Mississippi    chilled   95   10.6
79   Mc3 Mississippi    chilled  175   18.0
80   Mc3 Mississippi    chilled  250   17.9
81   Mc3 Mississippi    chilled  350   17.9
82   Mc3 Mississippi    chilled  500   17.9
83   Mc3 Mississippi    chilled  675   18.9
84   Mc3 Mississippi    chilled 1000   19.9
(r->clj '(attributes CO2))
{:names ["Plant" "Type" "Treatment" "conc" "uptake"],
 :row.names
 [1
  2
  3
  4
  5
  6
  7
  8
  9
  10
  11
  12
  13
  14
  15
  16
  17
  18
  19
  20
  21
  22
  23
  24
  25
  26
  27
  28
  29
  30
  31
  32
  33
  34
  35
  36
  37
  38
  39
  40
  41
  42
  43
  44
  45
  46
  47
  48
  49
  50
  51
  52
  53
  54
  55
  56
  57
  58
  59
  60
  61
  62
  63
  64
  65
  66
  67
  68
  69
  70
  71
  72
  73
  74
  75
  76
  77
  78
  79
  80
  81
  82
  83
  84],
 :class ["nfnGroupedData" "nfGroupedData" "groupedData" "data.frame"],
 :formula [~ uptake [| conc Plant]],
 :outer [~ [* Treatment Type]],
 :labels
 {:x ["Ambient carbon dioxide concentration"], :y ["CO2 uptake rate"]},
 :units {:x ["(uL/L)"], :y ["(umol/m^2 s)"]}}
(r->clj r.datasets/CO2)

_unnamed [84 6]:

:$row.names :Plant :Type :Treatment :conc :uptake
1 :Qn1 :Quebec :nonchilled 95.0 16.0
2 :Qn1 :Quebec :nonchilled 175.0 30.4
3 :Qn1 :Quebec :nonchilled 250.0 34.8
4 :Qn1 :Quebec :nonchilled 350.0 37.2
5 :Qn1 :Quebec :nonchilled 500.0 35.3
6 :Qn1 :Quebec :nonchilled 675.0 39.2
7 :Qn1 :Quebec :nonchilled 1000.0 39.7
8 :Qn2 :Quebec :nonchilled 95.0 13.6
9 :Qn2 :Quebec :nonchilled 175.0 27.3
10 :Qn2 :Quebec :nonchilled 250.0 37.1
74 :Mc2 :Mississippi :chilled 350.0 13.0
75 :Mc2 :Mississippi :chilled 500.0 12.5
76 :Mc2 :Mississippi :chilled 675.0 13.7
77 :Mc2 :Mississippi :chilled 1000.0 14.4
78 :Mc3 :Mississippi :chilled 95.0 10.6
79 :Mc3 :Mississippi :chilled 175.0 18.0
80 :Mc3 :Mississippi :chilled 250.0 17.9
81 :Mc3 :Mississippi :chilled 350.0 17.9
82 :Mc3 :Mississippi :chilled 500.0 17.9
83 :Mc3 :Mississippi :chilled 675.0 18.9
84 :Mc3 :Mississippi :chilled 1000.0 19.9

4.2 Table

A table is converted to long form, where each dimension has its own column. If a dimension has no name, its column id is prefixed with :$col (for example :$col-0). Values are stored in the last column, :$value.
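The long form described above can be illustrated with plain Clojure data. This is only a minimal sketch, not ClojisR's implementation: `table->long` is a hypothetical helper, it assumes R's column-major storage (the first dimension varies fastest), and the row order it produces may differ from what r->clj actually returns. The example values are the Dept = A slice of UCBAdmissions.

```clojure
;; Hypothetical helper for illustration only; not part of ClojisR.
(defn table->long
  "Expand a flat, column-major vector of cell values into long-form rows.
  `dimnames` is a vector of [dimension-key labels] pairs, first dimension first."
  [dimnames values]
  (let [ks     (mapv first dimnames)
        labels (mapv second dimnames)
        ;; Cartesian product of labels in which the FIRST dimension
        ;; varies fastest, matching R's column-major layout.
        combos (reduce (fn [acc ls]
                         (for [l ls, combo acc]
                           (conj combo l)))
                       [[]]
                       labels)]
    ;; One row per cell: a column per dimension plus the :$value column.
    (mapv (fn [combo v] (assoc (zipmap ks combo) :$value v))
          combos
          values)))

;; Dept = A slice of UCBAdmissions, flattened column by column.
(def dept-a
  (table->long [[:Admit  ["Admitted" "Rejected"]]
                [:Gender ["Male" "Female"]]]
               [512 313 89 19]))

;; (second dept-a) equals {:Admit "Rejected", :Gender "Male", :$value 313}
```

Each cell of the table becomes one row, so a table with dimensions 2 x 2 x 6 yields 24 rows.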

r.datasets/UCBAdmissions
, , Dept = A

          Gender
Admit      Male Female
  Admitted  512     89
  Rejected  313     19

, , Dept = B

          Gender
Admit      Male Female
  Admitted  353     17
  Rejected  207      8

, , Dept = C

          Gender
Admit      Male Female
  Admitted  120    202
  Rejected  205    391

, , Dept = D

          Gender
Admit      Male Female
  Admitted  138    131
  Rejected  279    244

, , Dept = E

          Gender
Admit      Male Female
  Admitted   53     94
  Rejected  138    299

, , Dept = F

          Gender
Admit      Male Female
  Admitted   22     24
  Rejected  351    317

(r->clj '(attributes UCBAdmissions))
{:dim [2 2 6],
 :dimnames
 {:Admit ["Admitted" "Rejected"],
  :Gender ["Male" "Female"],
  :Dept ["A" "B" "C" "D" "E" "F"]},
 :class ["table"]}
(r->clj r.datasets/UCBAdmissions)

_unnamed [24 4]:

:Admit :Gender :Dept :$value
Admitted Male A 512.0
Admitted Male B 313.0
Admitted Male C 89.0
Admitted Male D 19.0
Admitted Male E 353.0
Admitted Male F 207.0
Admitted Female A 17.0
Admitted Female B 8.0
Admitted Female C 120.0
Admitted Female D 205.0
Rejected Male B 279.0
Rejected Male C 131.0
Rejected Male D 244.0
Rejected Male E 53.0
Rejected Male F 138.0
Rejected Female A 94.0
Rejected Female B 299.0
Rejected Female C 22.0
Rejected Female D 351.0
Rejected Female E 24.0
Rejected Female F 317.0
r.datasets/crimtab
     142.24 144.78 147.32 149.86 152.4 154.94 157.48 160.02 162.56 165.1 167.64
9.4       0      0      0      0     0      0      0      0      0     0      0
9.5       0      0      0      0     0      1      0      0      0     0      0
9.6       0      0      0      0     0      0      0      0      0     0      0
9.7       0      0      0      0     0      0      0      0      0     0      0
9.8       0      0      0      0     0      0      1      0      0     0      0
9.9       0      0      1      0     1      0      1      0      0     0      0
10        1      0      0      1     2      0      2      0      0     1      0
10.1      0      0      0      1     3      1      0      1      1     0      0
10.2      0      0      2      2     2      1      0      2      0     1      0
10.3      0      1      1      3     2      2      3      5      0     0      0
10.4      0      0      1      1     2      3      3      4      3     3      0
10.5      0      0      0      1     3      7      6      4      3     1      3
10.6      0      0      0      1     4      5      9     14      6     3      1
10.7      0      0      1      2     4      9     14     16     15     7      3
10.8      0      0      0      2     5      6     14     27     10     7      1
10.9      0      0      0      0     2      6     14     24     27    14     10
11        0      0      0      2     6     12     15     31     37    27     17
11.1      0      0      0      3     3     12     22     26     24    26     24
11.2      0      0      0      3     2      7     21     30     38    29     27
11.3      0      0      0      1     0      5     10     24     26    39     26
11.4      0      0      0      0     3      4      9     29     56    58     26
11.5      0      0      0      0     0      5     11     17     33    57     38
11.6      0      0      0      0     2      1      4     13     37    39     48
11.7      0      0      0      0     0      2      9     17     30    37     48
11.8      0      0      0      0     1      0      2     11     15    35     41
11.9      0      0      0      0     1      1      2     12     10    27     32
12        0      0      0      0     0      0      1      4      8    19     42
12.1      0      0      0      0     0      0      0      2      4    13     22
12.2      0      0      0      0     0      0      1      2      5     6     23
12.3      0      0      0      0     0      0      0      0      4     8     10
12.4      0      0      0      0     0      0      1      1      1     2      7
12.5      0      0      0      0     0      0      0      1      0     1      3
12.6      0      0      0      0     0      0      0      0      0     1      0
12.7      0      0      0      0     0      0      0      0      0     1      1
12.8      0      0      0      0     0      0      0      0      0     0      1
12.9      0      0      0      0     0      0      0      0      0     0      0
13        0      0      0      0     0      0      0      0      0     0      3
13.1      0      0      0      0     0      0      0      0      0     0      0
13.2      0      0      0      0     0      0      0      0      0     0      1
13.3      0      0      0      0     0      0      0      0      0     0      0
13.4      0      0      0      0     0      0      0      0      0     0      0
13.5      0      0      0      0     0      0      0      0      0     0      0
     170.18 172.72 175.26 177.8 180.34 182.88 185.42 187.96 190.5 193.04 195.58
9.4       0      0      0     0      0      0      0      0     0      0      0
9.5       0      0      0     0      0      0      0      0     0      0      0
9.6       0      0      0     0      0      0      0      0     0      0      0
9.7       0      0      0     0      0      0      0      0     0      0      0
9.8       0      0      0     0      0      0      0      0     0      0      0
9.9       0      0      0     0      0      0      0      0     0      0      0
10        0      0      0     0      0      0      0      0     0      0      0
10.1      0      0      0     0      0      0      0      0     0      0      0
10.2      0      0      0     0      0      0      0      0     0      0      0
10.3      0      0      0     0      0      0      0      0     0      0      0
10.4      0      0      0     0      0      0      0      0     0      0      0
10.5      1      0      1     0      0      0      0      0     0      0      0
10.6      0      0      1     0      0      0      0      0     0      0      0
10.7      1      2      0     0      0      0      0      0     0      0      0
10.8      2      1      0     0      0      0      0      0     0      0      0
10.9      4      1      0     0      0      0      0      0     0      0      0
11       10      6      0     0      0      0      0      0     0      0      0
11.1      7      4      1     0      0      0      0      0     0      0      0
11.2     20      4      1     0      0      0      0      0     0      0      1
11.3     24      7      2     0      0      0      0      0     0      0      0
11.4     22     10     11     0      0      0      0      0     0      0      0
11.5     34     25     11     2      0      0      0      0     0      0      0
11.6     38     27     12     2      2      0      1      0     0      0      0
11.7     45     24      9     9      2      0      0      0     0      0      0
11.8     34     29     10     5      1      0      0      0     0      0      0
11.9     35     19     10     9      3      1      0      0     0      0      0
12       39     22     16     8      2      2      0      0     0      0      0
12.1     28     15     27    10      4      1      0      0     0      0      0
12.2     17     16     11     8      1      1      0      0     0      0      0
12.3     13     20     23     6      5      0      0      0     0      0      0
12.4     12      4      7     7      1      0      0      1     0      0      0
12.5     12     11      8     6      8      0      2      0     0      0      0
12.6      3      5      7     8      6      3      1      1     0      0      0
12.7      7      5      5     8      2      2      0      0     0      0      0
12.8      2      3      1     8      5      3      1      1     0      0      0
12.9      1      2      2     0      1      1      0      0     0      0      0
13        0      1      0     1      0      2      1      0     0      0      0
13.1      1      1      0     0      0      0      0      0     0      0      0
13.2      1      0      1     0      3      0      0      0     0      0      0
13.3      0      0      0     0      0      1      0      1     0      0      0
13.4      0      0      0     0      0      0      0      0     0      0      0
13.5      0      0      0     0      0      0      1      0     0      0      0
(r->clj '(attributes crimtab))
{:dim [42 22],
 :dimnames
 [["9.4"
   "9.5"
   "9.6"
   "9.7"
   "9.8"
   "9.9"
   "10"
   "10.1"
   "10.2"
   "10.3"
   "10.4"
   "10.5"
   "10.6"
   "10.7"
   "10.8"
   "10.9"
   "11"
   "11.1"
   "11.2"
   "11.3"
   "11.4"
   "11.5"
   "11.6"
   "11.7"
   "11.8"
   "11.9"
   "12"
   "12.1"
   "12.2"
   "12.3"
   "12.4"
   "12.5"
   "12.6"
   "12.7"
   "12.8"
   "12.9"
   "13"
   "13.1"
   "13.2"
   "13.3"
   "13.4"
   "13.5"]
  ["142.24"
   "144.78"
   "147.32"
   "149.86"
   "152.4"
   "154.94"
   "157.48"
   "160.02"
   "162.56"
   "165.1"
   "167.64"
   "170.18"
   "172.72"
   "175.26"
   "177.8"
   "180.34"
   "182.88"
   "185.42"
   "187.96"
   "190.5"
   "193.04"
   "195.58"]],
 :class ["table"]}
(r->clj r.datasets/crimtab)

_unnamed [924 3]:

:$col-0 :$col-1 :$value
9.4 142.24 0
9.4 144.78 0
9.4 147.32 0
9.4 149.86 0
9.4 152.4 0
9.4 154.94 0
9.4 157.48 1
9.4 160.02 0
9.4 162.56 0
9.4 165.1 0
13.5 170.18 0
13.5 172.72 0
13.5 175.26 0
13.5 177.8 0
13.5 180.34 0
13.5 182.88 0
13.5 185.42 0
13.5 187.96 0
13.5 190.5 0
13.5 193.04 0
13.5 195.58 0

4.3 Matrices, arrays, multidimensional arrays

The first two dimensions form the dataset; every additional dimension is added as a column.
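As a rough illustration of this layout, the sketch below unfolds a three-dimensional array in plain Clojure. It is an assumption-laden sketch rather than ClojisR's code: `array3d->rows` is a hypothetical helper, the input is a made-up 2 x 3 x 2 array, and it assumes R's column-major storage. The third dimension ends up in a :$col-0 column, as in the iris3 output.

```clojure
;; Hypothetical helper for illustration only; not part of ClojisR.
(defn array3d->rows
  "Unfold a flat, column-major 3-d array into rows.
  The first two dimensions give rows and columns of each slab;
  the third dimension is stored in the extra :$col-0 column."
  [[nrow ncol nslab] col-names slab-names values]
  (let [values (vec values)]
    (vec
     (for [k (range nslab)
           i (range nrow)]
       (into {:$col-0 (nth slab-names k)}
             (map (fn [j]
                    ;; column-major offset of element [i j k]
                    [(nth col-names j)
                     (nth values (+ i (* nrow j) (* nrow ncol k)))])
                  (range ncol)))))))

;; Made-up data: the values 0..11 arranged as a 2 x 3 x 2 array.
(def rows
  (array3d->rows [2 3 2] ["a" "b" "c"] ["s1" "s2"] (range 12)))

;; (first rows) equals {:$col-0 "s1", "a" 0, "b" 2, "c" 4}
;; (last rows)  equals {:$col-0 "s2", "a" 7, "b" 9, "c" 11}
```

The slabs are stacked vertically, so the result has nrow x nslab rows and ncol + 1 columns.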

r.datasets/VADeaths
      Rural Male Rural Female Urban Male Urban Female
50-54       11.7          8.7       15.4          8.4
55-59       18.1         11.7       24.3         13.6
60-64       26.9         20.3       37.0         19.3
65-69       41.0         30.9       54.6         35.1
70-74       66.0         54.3       71.1         50.0
(r->clj '(attributes VADeaths))
{:dim [5 4],
 :dimnames
 [["50-54" "55-59" "60-64" "65-69" "70-74"]
  ["Rural Male" "Rural Female" "Urban Male" "Urban Female"]]}
(r->clj r.datasets/VADeaths)

_unnamed [5 5]:

:$row.names Rural Male Rural Female Urban Male Urban Female
50-54 11.7 8.7 15.4 8.4
55-59 18.1 11.7 24.3 13.6
60-64 26.9 20.3 37.0 19.3
65-69 41.0 30.9 54.6 35.1
70-74 66.0 54.3 71.1 50.0
r.datasets/freeny-x
      lag quarterly revenue price index income level market potential
 [1,]               8.79636     4.70997      5.82110          12.9699
 [2,]               8.79236     4.70217      5.82558          12.9733
 [3,]               8.79137     4.68944      5.83112          12.9774
 [4,]               8.81486     4.68558      5.84046          12.9806
 [5,]               8.81301     4.64019      5.85036          12.9831
 [6,]               8.90751     4.62553      5.86464          12.9854
 [7,]               8.93673     4.61991      5.87769          12.9900
 [8,]               8.96161     4.61654      5.89763          12.9943
 [9,]               8.96044     4.61407      5.92574          12.9992
[10,]               9.00868     4.60766      5.94232          13.0033
[11,]               9.03049     4.60227      5.95365          13.0099
[12,]               9.06906     4.58960      5.96120          13.0159
[13,]               9.05871     4.57592      5.97805          13.0212
[14,]               9.10698     4.58661      6.00377          13.0265
[15,]               9.12685     4.57997      6.02829          13.0351
[16,]               9.17096     4.57176      6.03475          13.0429
[17,]               9.18665     4.56104      6.03906          13.0497
[18,]               9.23823     4.54906      6.05046          13.0551
[19,]               9.26487     4.53957      6.05563          13.0634
[20,]               9.28436     4.51018      6.06093          13.0693
[21,]               9.31378     4.50352      6.07103          13.0737
[22,]               9.35025     4.49360      6.08018          13.0770
[23,]               9.35835     4.46505      6.08858          13.0849
[24,]               9.39767     4.44924      6.10199          13.0918
[25,]               9.42150     4.43966      6.11207          13.0950
[26,]               9.44223     4.42025      6.11596          13.0984
[27,]               9.48721     4.41060      6.12129          13.1089
[28,]               9.52374     4.41151      6.12200          13.1169
[29,]               9.53980     4.39810      6.13119          13.1222
[30,]               9.58123     4.38513      6.14705          13.1266
[31,]               9.60048     4.37320      6.15336          13.1356
[32,]               9.64496     4.32770      6.15627          13.1415
[33,]               9.64390     4.32023      6.16274          13.1444
[34,]               9.69405     4.30909      6.17369          13.1459
[35,]               9.69958     4.30909      6.16135          13.1520
[36,]               9.68683     4.30552      6.18231          13.1593
[37,]               9.71774     4.29627      6.18768          13.1579
[38,]               9.74924     4.27839      6.19377          13.1625
[39,]               9.77536     4.27789      6.20030          13.1664
(r->clj '(attributes freeny.x))
{:dim [39 4],
 :dimnames
 [nil
  ["lag quarterly revenue"
   "price index"
   "income level"
   "market potential"]]}
(r->clj r.datasets/freeny-x)

_unnamed [39 4]:

lag quarterly revenue price index income level market potential
8.79636 4.70997 5.82110 12.9699
8.79236 4.70217 5.82558 12.9733
8.79137 4.68944 5.83112 12.9774
8.81486 4.68558 5.84046 12.9806
8.81301 4.64019 5.85036 12.9831
8.90751 4.62553 5.86464 12.9854
8.93673 4.61991 5.87769 12.9900
8.96161 4.61654 5.89763 12.9943
8.96044 4.61407 5.92574 12.9992
9.00868 4.60766 5.94232 13.0033
9.53980 4.39810 6.13119 13.1222
9.58123 4.38513 6.14705 13.1266
9.60048 4.37320 6.15336 13.1356
9.64496 4.32770 6.15627 13.1415
9.64390 4.32023 6.16274 13.1444
9.69405 4.30909 6.17369 13.1459
9.69958 4.30909 6.16135 13.1520
9.68683 4.30552 6.18231 13.1593
9.71774 4.29627 6.18768 13.1579
9.74924 4.27839 6.19377 13.1625
9.77536 4.27789 6.20030 13.1664
r.datasets/iris3
, , Setosa

      Sepal L. Sepal W. Petal L. Petal W.
 [1,]      5.1      3.5      1.4      0.2
 [2,]      4.9      3.0      1.4      0.2
 [3,]      4.7      3.2      1.3      0.2
 [4,]      4.6      3.1      1.5      0.2
 [5,]      5.0      3.6      1.4      0.2
 [6,]      5.4      3.9      1.7      0.4
 [7,]      4.6      3.4      1.4      0.3
 [8,]      5.0      3.4      1.5      0.2
 [9,]      4.4      2.9      1.4      0.2
[10,]      4.9      3.1      1.5      0.1
[11,]      5.4      3.7      1.5      0.2
[12,]      4.8      3.4      1.6      0.2
[13,]      4.8      3.0      1.4      0.1
[14,]      4.3      3.0      1.1      0.1
[15,]      5.8      4.0      1.2      0.2
[16,]      5.7      4.4      1.5      0.4
[17,]      5.4      3.9      1.3      0.4
[18,]      5.1      3.5      1.4      0.3
[19,]      5.7      3.8      1.7      0.3
[20,]      5.1      3.8      1.5      0.3
[21,]      5.4      3.4      1.7      0.2
[22,]      5.1      3.7      1.5      0.4
[23,]      4.6      3.6      1.0      0.2
[24,]      5.1      3.3      1.7      0.5
[25,]      4.8      3.4      1.9      0.2
[26,]      5.0      3.0      1.6      0.2
[27,]      5.0      3.4      1.6      0.4
[28,]      5.2      3.5      1.5      0.2
[29,]      5.2      3.4      1.4      0.2
[30,]      4.7      3.2      1.6      0.2
[31,]      4.8      3.1      1.6      0.2
[32,]      5.4      3.4      1.5      0.4
[33,]      5.2      4.1      1.5      0.1
[34,]      5.5      4.2      1.4      0.2
[35,]      4.9      3.1      1.5      0.2
[36,]      5.0      3.2      1.2      0.2
[37,]      5.5      3.5      1.3      0.2
[38,]      4.9      3.6      1.4      0.1
[39,]      4.4      3.0      1.3      0.2
[40,]      5.1      3.4      1.5      0.2
[41,]      5.0      3.5      1.3      0.3
[42,]      4.5      2.3      1.3      0.3
[43,]      4.4      3.2      1.3      0.2
[44,]      5.0      3.5      1.6      0.6
[45,]      5.1      3.8      1.9      0.4
[46,]      4.8      3.0      1.4      0.3
[47,]      5.1      3.8      1.6      0.2
[48,]      4.6      3.2      1.4      0.2
[49,]      5.3      3.7      1.5      0.2
[50,]      5.0      3.3      1.4      0.2

, , Versicolor

      Sepal L. Sepal W. Petal L. Petal W.
 [1,]      7.0      3.2      4.7      1.4
 [2,]      6.4      3.2      4.5      1.5
 [3,]      6.9      3.1      4.9      1.5
 [4,]      5.5      2.3      4.0      1.3
 [5,]      6.5      2.8      4.6      1.5
 [6,]      5.7      2.8      4.5      1.3
 [7,]      6.3      3.3      4.7      1.6
 [8,]      4.9      2.4      3.3      1.0
 [9,]      6.6      2.9      4.6      1.3
[10,]      5.2      2.7      3.9      1.4
[11,]      5.0      2.0      3.5      1.0
[12,]      5.9      3.0      4.2      1.5
[13,]      6.0      2.2      4.0      1.0
[14,]      6.1      2.9      4.7      1.4
[15,]      5.6      2.9      3.6      1.3
[16,]      6.7      3.1      4.4      1.4
[17,]      5.6      3.0      4.5      1.5
[18,]      5.8      2.7      4.1      1.0
[19,]      6.2      2.2      4.5      1.5
[20,]      5.6      2.5      3.9      1.1
[21,]      5.9      3.2      4.8      1.8
[22,]      6.1      2.8      4.0      1.3
[23,]      6.3      2.5      4.9      1.5
[24,]      6.1      2.8      4.7      1.2
[25,]      6.4      2.9      4.3      1.3
[26,]      6.6      3.0      4.4      1.4
[27,]      6.8      2.8      4.8      1.4
[28,]      6.7      3.0      5.0      1.7
[29,]      6.0      2.9      4.5      1.5
[30,]      5.7      2.6      3.5      1.0
[31,]      5.5      2.4      3.8      1.1
[32,]      5.5      2.4      3.7      1.0
[33,]      5.8      2.7      3.9      1.2
[34,]      6.0      2.7      5.1      1.6
[35,]      5.4      3.0      4.5      1.5
[36,]      6.0      3.4      4.5      1.6
[37,]      6.7      3.1      4.7      1.5
[38,]      6.3      2.3      4.4      1.3
[39,]      5.6      3.0      4.1      1.3
[40,]      5.5      2.5      4.0      1.3
[41,]      5.5      2.6      4.4      1.2
[42,]      6.1      3.0      4.6      1.4
[43,]      5.8      2.6      4.0      1.2
[44,]      5.0      2.3      3.3      1.0
[45,]      5.6      2.7      4.2      1.3
[46,]      5.7      3.0      4.2      1.2
[47,]      5.7      2.9      4.2      1.3
[48,]      6.2      2.9      4.3      1.3
[49,]      5.1      2.5      3.0      1.1
[50,]      5.7      2.8      4.1      1.3

, , Virginica

      Sepal L. Sepal W. Petal L. Petal W.
 [1,]      6.3      3.3      6.0      2.5
 [2,]      5.8      2.7      5.1      1.9
 [3,]      7.1      3.0      5.9      2.1
 [4,]      6.3      2.9      5.6      1.8
 [5,]      6.5      3.0      5.8      2.2
 [6,]      7.6      3.0      6.6      2.1
 [7,]      4.9      2.5      4.5      1.7
 [8,]      7.3      2.9      6.3      1.8
 [9,]      6.7      2.5      5.8      1.8
[10,]      7.2      3.6      6.1      2.5
[11,]      6.5      3.2      5.1      2.0
[12,]      6.4      2.7      5.3      1.9
[13,]      6.8      3.0      5.5      2.1
[14,]      5.7      2.5      5.0      2.0
[15,]      5.8      2.8      5.1      2.4
[16,]      6.4      3.2      5.3      2.3
[17,]      6.5      3.0      5.5      1.8
[18,]      7.7      3.8      6.7      2.2
[19,]      7.7      2.6      6.9      2.3
[20,]      6.0      2.2      5.0      1.5
[21,]      6.9      3.2      5.7      2.3
[22,]      5.6      2.8      4.9      2.0
[23,]      7.7      2.8      6.7      2.0
[24,]      6.3      2.7      4.9      1.8
[25,]      6.7      3.3      5.7      2.1
[26,]      7.2      3.2      6.0      1.8
[27,]      6.2      2.8      4.8      1.8
[28,]      6.1      3.0      4.9      1.8
[29,]      6.4      2.8      5.6      2.1
[30,]      7.2      3.0      5.8      1.6
[31,]      7.4      2.8      6.1      1.9
[32,]      7.9      3.8      6.4      2.0
[33,]      6.4      2.8      5.6      2.2
[34,]      6.3      2.8      5.1      1.5
[35,]      6.1      2.6      5.6      1.4
[36,]      7.7      3.0      6.1      2.3
[37,]      6.3      3.4      5.6      2.4
[38,]      6.4      3.1      5.5      1.8
[39,]      6.0      3.0      4.8      1.8
[40,]      6.9      3.1      5.4      2.1
[41,]      6.7      3.1      5.6      2.4
[42,]      6.9      3.1      5.1      2.3
[43,]      5.8      2.7      5.1      1.9
[44,]      6.8      3.2      5.9      2.3
[45,]      6.7      3.3      5.7      2.5
[46,]      6.7      3.0      5.2      2.3
[47,]      6.3      2.5      5.0      1.9
[48,]      6.5      3.0      5.2      2.0
[49,]      6.2      3.4      5.4      2.3
[50,]      5.9      3.0      5.1      1.8

(r->clj '(attributes iris3))
{:dim [50 4 3],
 :dimnames
 [nil
  ["Sepal L." "Sepal W." "Petal L." "Petal W."]
  ["Setosa" "Versicolor" "Virginica"]]}
(r->clj r.datasets/iris3)

_unnamed [150 5]:

:$col-0 Sepal L. Sepal W. Petal L. Petal W.
Setosa 5.1 3.5 1.4 0.2
Setosa 4.9 3.0 1.4 0.2
Setosa 4.7 3.2 1.3 0.2
Setosa 4.6 3.1 1.5 0.2
Setosa 5.0 3.6 1.4 0.2
Setosa 5.4 3.9 1.7 0.4
Setosa 4.6 3.4 1.4 0.3
Setosa 5.0 3.4 1.5 0.2
Setosa 4.4 2.9 1.4 0.2
Setosa 4.9 3.1 1.5 0.1
Virginica 6.9 3.1 5.4 2.1
Virginica 6.7 3.1 5.6 2.4
Virginica 6.9 3.1 5.1 2.3
Virginica 5.8 2.7 5.1 1.9
Virginica 6.8 3.2 5.9 2.3
Virginica 6.7 3.3 5.7 2.5
Virginica 6.7 3.0 5.2 2.3
Virginica 6.3 2.5 5.0 1.9
Virginica 6.5 3.0 5.2 2.0
Virginica 6.2 3.4 5.4 2.3
Virginica 5.9 3.0 5.1 1.8
(def array-5d (r '(array ~(range 60) :dim [2 5 1 3 2])))
array-5d
, , 1, 1, 1

     [,1] [,2] [,3] [,4] [,5]
[1,]    0    2    4    6    8
[2,]    1    3    5    7    9

, , 1, 2, 1

     [,1] [,2] [,3] [,4] [,5]
[1,]   10   12   14   16   18
[2,]   11   13   15   17   19

, , 1, 3, 1

     [,1] [,2] [,3] [,4] [,5]
[1,]   20   22   24   26   28
[2,]   21   23   25   27   29

, , 1, 1, 2

     [,1] [,2] [,3] [,4] [,5]
[1,]   30   32   34   36   38
[2,]   31   33   35   37   39

, , 1, 2, 2

     [,1] [,2] [,3] [,4] [,5]
[1,]   40   42   44   46   48
[2,]   41   43   45   47   49

, , 1, 3, 2

     [,1] [,2] [,3] [,4] [,5]
[1,]   50   52   54   56   58
[2,]   51   53   55   57   59

(r->clj '(attributes ~array-5d))
{:dim [2 5 1 3 2]}
(r->clj array-5d)

_unnamed [12 8]:

:$col-0 :$col-1 :$col-2 1 2 3 4 5
1 1 1 0 2 4 6 8
1 1 1 1 3 5 7 9
1 1 2 10 12 14 16 18
1 1 2 11 13 15 17 19
1 2 1 20 22 24 26 28
1 2 1 21 23 25 27 29
1 2 2 30 32 34 36 38
1 2 2 31 33 35 37 39
1 3 1 40 42 44 46 48
1 3 1 41 43 45 47 49
1 3 2 50 52 54 56 58
1 3 2 51 53 55 57 59

4.4 1D timeseries

Timeseries are stored in two columns:

  • :$time - to store time identifier as double
  • :$series - to store timeseries
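The relation between R's tsp attribute (start, end, frequency) and the time column can be sketched in plain Clojure. This is a hedged illustration, not ClojisR's implementation: `ts->rows` is a hypothetical helper that follows the usual R convention that observation i (counting from 0) has time start + i / frequency. The sample values are the first three observations of BJsales.

```clojure
;; Hypothetical helper for illustration only; not part of ClojisR.
(defn ts->rows
  "Pair each observation with its time identifier (a double),
  derived from the tsp triple [start end frequency]."
  [[start _end frequency] values]
  (mapv (fn [i v]
          {:$time   (+ start (/ i (double frequency)))
           :$series v})
        (range)   ; 0, 1, 2, ... stops when `values` is exhausted
        values))

;; tsp of BJsales and its first three observations.
(def bj-head
  (ts->rows [1.0 150.0 1.0] [200.1 199.5 199.4]))

;; (mapv :$time bj-head) equals [1.0 2.0 3.0]
```

With a frequency of 260, as in EuStockMarkets, consecutive time identifiers are 1/260 apart, which is why the :$time values there advance by about 0.0038.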
r.datasets/BJsales
Time Series:
Start = 1 
End = 150 
Frequency = 1 
  [1] 200.1 199.5 199.4 198.9 199.0 200.2 198.6 200.0 200.3 201.2 201.6 201.5
 [13] 201.5 203.5 204.9 207.1 210.5 210.5 209.8 208.8 209.5 213.2 213.7 215.1
 [25] 218.7 219.8 220.5 223.8 222.8 223.8 221.7 222.3 220.8 219.4 220.1 220.6
 [37] 218.9 217.8 217.7 215.0 215.3 215.9 216.7 216.7 217.7 218.7 222.9 224.9
 [49] 222.2 220.7 220.0 218.7 217.0 215.9 215.8 214.1 212.3 213.9 214.6 213.6
 [61] 212.1 211.4 213.1 212.9 213.3 211.5 212.3 213.0 211.0 210.7 210.1 211.4
 [73] 210.0 209.7 208.8 208.8 208.8 210.6 211.9 212.8 212.5 214.8 215.3 217.5
 [85] 218.8 220.7 222.2 226.7 228.4 233.2 235.7 237.1 240.6 243.8 245.3 246.0
 [97] 246.3 247.7 247.6 247.8 249.4 249.0 249.9 250.5 251.5 249.0 247.6 248.8
[109] 250.4 250.7 253.0 253.7 255.0 256.2 256.0 257.4 260.4 260.0 261.3 260.4
[121] 261.6 260.8 259.8 259.0 258.9 257.4 257.7 257.9 257.4 257.3 257.6 258.9
[133] 257.8 257.7 257.2 257.5 256.8 257.5 257.0 257.6 257.3 257.5 259.6 261.1
[145] 262.9 263.3 262.8 261.8 262.2 262.7
(r->clj '(attributes BJsales))
{:tsp [1.0 150.0 1.0], :class ["ts"]}
(r->clj r.datasets/BJsales)

_unnamed [149 2]:

:$time :$series
1.0 200.1
2.0 199.5
3.0 199.4
4.0 198.9
5.0 199.0
6.0 200.2
7.0 198.6
8.0 200.0
9.0 200.3
10.0 201.2
139.0 257.0
140.0 257.6
141.0 257.3
142.0 257.5
143.0 259.6
144.0 261.1
145.0 262.9
146.0 263.3
147.0 262.8
148.0 261.8
149.0 262.2

4.5 Multidimensional timeseries

(r '(window EuStockMarkets :end [1991,155]))
Time Series:
Start = c(1991, 130) 
End = c(1991, 155) 
Frequency = 260 
             DAX    SMI    CAC   FTSE
1991.496 1628.75 1678.1 1772.8 2443.6
1991.500 1613.63 1688.5 1750.5 2460.2
1991.504 1606.51 1678.6 1718.0 2448.2
1991.508 1621.04 1684.1 1708.1 2470.4
1991.512 1618.16 1686.6 1723.1 2484.7
1991.515 1610.61 1671.6 1714.3 2466.8
1991.519 1630.75 1682.9 1734.5 2487.9
1991.523 1640.17 1703.6 1757.4 2508.4
1991.527 1635.47 1697.5 1754.0 2510.5
1991.531 1645.89 1716.3 1754.3 2497.4
1991.535 1647.84 1723.8 1759.8 2532.5
1991.538 1638.35 1730.5 1755.5 2556.8
1991.542 1629.93 1727.4 1758.1 2561.0
1991.546 1621.49 1733.3 1757.5 2547.3
1991.550 1624.74 1734.0 1763.5 2541.5
1991.554 1627.63 1728.3 1762.8 2558.5
1991.558 1631.99 1737.1 1768.9 2587.9
1991.562 1621.18 1723.1 1778.1 2580.5
1991.565 1613.42 1723.6 1780.1 2579.6
1991.569 1604.95 1719.0 1767.7 2589.3
1991.573 1605.75 1721.2 1757.9 2595.0
1991.577 1616.67 1725.3 1756.6 2595.6
1991.581 1619.29 1727.2 1754.7 2588.8
1991.585 1620.49 1727.2 1766.8 2591.7
1991.588 1619.67 1731.6 1766.5 2601.7
1991.592 1623.07 1724.1 1762.2 2585.4
(r->clj '(attributes EuStockMarkets))
{:dim [1860 4],
 :dimnames [nil ["DAX" "SMI" "CAC" "FTSE"]],
 :tsp [1991.496153846154 1998.646153846154 260.0],
 :class ["mts" "ts" "matrix"]}
(r->clj r.datasets/EuStockMarkets)

_unnamed [1860 5]:

:$time DAX SMI CAC FTSE
1991.49615385 1628.75 1678.1 1772.8 2443.6
1991.50000000 1613.63 1688.5 1750.5 2460.2
1991.50384615 1606.51 1678.6 1718.0 2448.2
1991.50769231 1621.04 1684.1 1708.1 2470.4
1991.51153846 1618.16 1686.6 1723.1 2484.7
1991.51538462 1610.61 1671.6 1714.3 2466.8
1991.51923077 1630.75 1682.9 1734.5 2487.9
1991.52307692 1640.17 1703.6 1757.4 2508.4
1991.52692308 1635.47 1697.5 1754.0 2510.5
1991.53076923 1645.89 1716.3 1754.3 2497.4
1998.60769231 5861.19 8239.5 4177.3 5837.0
1998.61153846 5774.38 8139.2 4095.0 5809.7
1998.61538462 5718.70 8170.2 4047.9 5736.1
1998.61923077 5614.77 7943.2 3976.4 5632.5
1998.62307692 5528.12 7846.2 3968.6 5594.1
1998.62692308 5598.32 7952.9 4041.9 5680.4
1998.63076923 5460.43 7721.3 3939.5 5587.6
1998.63461538 5285.78 7447.9 3846.0 5432.8
1998.63846154 5386.94 7607.5 3945.7 5462.2
1998.64230769 5355.03 7552.6 3951.7 5399.5
1998.64615385 5473.72 7676.3 3995.0 5455.0
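
The :$time column can be reproduced from the :tsp attribute alone, which holds the start time, end time and sampling frequency. The following is a minimal sketch in plain Clojure, not clojisr's actual implementation; `tsp->times` is a hypothetical helper introduced only for illustration.

```clojure
(defn tsp->times
  "Hypothetical helper: expands an R tsp triple [start end frequency]
  into the vector of observation times."
  [[start end freq]]
  (let [;; number of observations implied by the triple
        n (inc (Math/round (double (* (- end start) freq))))]
    ;; the i-th observation sits i/frequency after the start
    (mapv #(+ start (/ % (double freq))) (range n))))

;; :tsp of EuStockMarkets, copied from the attributes above
(def eu-times (tsp->times [1991.496153846154 1998.646153846154 260.0]))

(count eu-times) ;; => 1860, the row count of the converted dataset
(first eu-times) ;; => 1991.496153846154
```

The remaining columns (DAX, SMI, CAC, FTSE) come from the matrix itself; only the time axis has to be derived.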

4.6 Datetime columns

(def dt (r "
   day <- c(\"20081101\", \"20081101\", \"20081101\", \"20081101\", \"18081101\", \"20081102\", \"20081102\", \"20081102\", \"20081102\", \"20081103\")
   time <- c(\"01:20:00\", \"06:00:00\", \"12:20:00\", \"17:30:00\", \"21:45:00\", \"01:15:00\", \"06:30:00\", \"12:50:00\", \"20:00:00\", \"01:05:00\")
   dts1 <- paste(day, time)
   dts2 <- as.POSIXct(dts1, format = \"%Y%m%d %H:%M:%S\")
   dts3 <- as.POSIXlt(dts1, format = \"%Y%m%d %H:%M:%S\")
   dts <- data.frame(posixct=dts2, posixlt=dts3)"))
dt
               posixct             posixlt
1  2008-11-01 01:20:00 2008-11-01 01:20:00
2  2008-11-01 06:00:00 2008-11-01 06:00:00
3  2008-11-01 12:20:00 2008-11-01 12:20:00
4  2008-11-01 17:30:00 2008-11-01 17:30:00
5  1808-11-01 21:45:00 1808-11-01 21:45:00
6  2008-11-02 01:15:00 2008-11-02 01:15:00
7  2008-11-02 06:30:00 2008-11-02 06:30:00
8  2008-11-02 12:50:00 2008-11-02 12:50:00
9  2008-11-02 20:00:00 2008-11-02 20:00:00
10 2008-11-03 01:05:00 2008-11-03 01:05:00
(r->clj '(attributes ~dt))
{:names ["posixct" "posixlt"],
 :class ["data.frame"],
 :row.names [1 2 3 4 5 6 7 8 9 10]}
(r->clj dt)

_unnamed [10 2]:

:posixct :posixlt
2008-11-01T01:20 2008-11-01T01:20
2008-11-01T06:00 2008-11-01T06:00
2008-11-01T12:20 2008-11-01T12:20
2008-11-01T17:30 2008-11-01T17:30
1808-11-01T21:45 1808-11-01T21:45
2008-11-02T01:15 2008-11-02T01:15
2008-11-02T06:30 2008-11-02T06:30
2008-11-02T12:50 2008-11-02T12:50
2008-11-02T20:00 2008-11-02T20:00
2008-11-03T01:05 2008-11-03T01:05
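
The printed values look like java.time local date-times. As an illustration of how the same strings parse on the JVM side, here is a sketch using only java.time. It assumes the Java pattern `yyyyMMdd HH:mm:ss` is an adequate counterpart to the R format used above, ignores time zones entirely, and uses a hypothetical helper `parse-stamp`; the column type clojisr actually produces is not confirmed here.

```clojure
(import '(java.time LocalDateTime)
        '(java.time.format DateTimeFormatter))

;; Assumed Java counterpart of R's "%Y%m%d %H:%M:%S"
(def ^DateTimeFormatter fmt
  (DateTimeFormatter/ofPattern "yyyyMMdd HH:mm:ss"))

(defn parse-stamp
  "Hypothetical helper: parses one pasted day/time string into a LocalDateTime."
  [^String s]
  (LocalDateTime/parse s fmt))

(str (parse-stamp "20081101 01:20:00")) ;; => "2008-11-01T01:20"
(str (parse-stamp "18081101 21:45:00")) ;; => "1808-11-01T21:45"
```

LocalDateTime omits the seconds field when it is zero, which matches the printed dataset.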

4.7 Distances

r.datasets/UScitiesD
              Atlanta Chicago Denver Houston LosAngeles Miami NewYork
Chicago           587                                                
Denver           1212     920                                        
Houston           701     940    879                                 
LosAngeles       1936    1745    831    1374                         
Miami             604    1188   1726     968       2339              
NewYork           748     713   1631    1420       2451  1092        
SanFrancisco     2139    1858    949    1645        347  2594    2571
Seattle          2182    1737   1021    1891        959  2734    2408
Washington.DC     543     597   1494    1220       2300   923     205
              SanFrancisco Seattle
Chicago                           
Denver                            
Houston                           
LosAngeles                        
Miami                             
NewYork                           
SanFrancisco                      
Seattle                678        
Washington.DC         2442    2329
(r->clj '(attributes UScitiesD))
{:Labels
 ["Atlanta"
  "Chicago"
  "Denver"
  "Houston"
  "LosAngeles"
  "Miami"
  "NewYork"
  "SanFrancisco"
  "Seattle"
  "Washington.DC"],
 :Size [10],
 :call {0 as.dist.default, :m [t cities.mat]},
 :class ["dist"],
 :Diag [false],
 :Upper [false]}
(r->clj r.datasets/UScitiesD)

_unnamed [10 11]:

:$row.names Atlanta Chicago Denver Houston LosAngeles Miami NewYork SanFrancisco Seattle Washington.DC
Atlanta 0 587 1212 701 1936 604 748 2139 2182 543
Chicago 587 0 920 940 1745 1188 713 1858 1737 597
Denver 1212 920 0 879 831 1726 1631 949 1021 1494
Houston 701 940 879 0 1374 968 1420 1645 1891 1220
LosAngeles 1936 1745 831 1374 0 2339 2451 347 959 2300
Miami 604 1188 1726 968 2339 0 1092 2594 2734 923
NewYork 748 713 1631 1420 2451 1092 0 2571 2408 205
SanFrancisco 2139 1858 949 1645 347 2594 2571 0 678 2442
Seattle 2182 1737 1021 1891 959 2734 2408 678 0 2329
Washington.DC 543 597 1494 1220 2300 923 205 2442 2329 0
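
An R dist object stores only the lower triangle of the distance matrix, column by column, while the converted dataset holds the full symmetric matrix with a zero diagonal. The expansion can be sketched in plain Clojure as follows; `dist->matrix` is a hypothetical helper, and this is not necessarily how clojisr performs the conversion.

```clojure
(defn dist->matrix
  "Hypothetical helper: expands a column-wise lower triangle of
  pairwise distances over n items into a full n x n matrix."
  [n lower]
  (let [;; position of the pair (i, j) in the flat vector, 0-based, i < j
        idx (fn [i j]
              (+ (- (* n i) (quot (* i (inc i)) 2))
                 (- j i 1)))]
    (vec (for [i (range n)]
           (vec (for [j (range n)]
                  (cond (= i j) 0
                        (< i j) (nth lower (idx i j))
                        :else   (nth lower (idx j i)))))))))

;; Atlanta, Chicago, Denver and Houston, read column-wise from the printout above
(dist->matrix 4 [587 1212 701 920 940 879])
;; => [[0 587 1212 701] [587 0 920 940] [1212 920 0 879] [701 940 879 0]]
```

The result agrees with the top-left 4 x 4 corner of the converted dataset.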

4.8 Other

4.8.1 List

r.datasets/Harman23-cor
$cov
               height arm.span forearm lower.leg weight bitro.diameter
height          1.000    0.846   0.805     0.859  0.473          0.398
arm.span        0.846    1.000   0.881     0.826  0.376          0.326
forearm         0.805    0.881   1.000     0.801  0.380          0.319
lower.leg       0.859    0.826   0.801     1.000  0.436          0.329
weight          0.473    0.376   0.380     0.436  1.000          0.762
bitro.diameter  0.398    0.326   0.319     0.329  0.762          1.000
chest.girth     0.301    0.277   0.237     0.327  0.730          0.583
chest.width     0.382    0.415   0.345     0.365  0.629          0.577
               chest.girth chest.width
height               0.301       0.382
arm.span             0.277       0.415
forearm              0.237       0.345
lower.leg            0.327       0.365
weight               0.730       0.629
bitro.diameter       0.583       0.577
chest.girth          1.000       0.539
chest.width          0.539       1.000

$center
[1] 0 0 0 0 0 0 0 0

$n.obs
[1] 305

(r->clj '(attributes Harman23.cor))
{:names ["cov" "center" "n.obs"]}
(r->clj r.datasets/Harman23-cor)
{:cov
 [1.0
  0.846
  0.805
  0.859
  0.473
  0.398
  0.301
  0.382
  0.846
  1.0
  0.881
  0.826
  0.376
  0.326
  0.277
  0.415
  0.805
  0.881
  1.0
  0.801
  0.38
  0.319
  0.237
  0.345
  0.859
  0.826
  0.801
  1.0
  0.436
  0.329
  0.327
  0.365
  0.473
  0.376
  0.38
  0.436
  1.0
  0.762
  0.73
  0.629
  0.398
  0.326
  0.319
  0.329
  0.762
  1.0
  0.583
  0.577
  0.301
  0.277
  0.237
  0.327
  0.73
  0.583
  1.0
  0.539
  0.382
  0.415
  0.345
  0.365
  0.629
  0.577
  0.539
  1.0],
 :center [0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0],
 :n.obs [305.0]}
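
Note that the matrix under :cov arrives as a flat vector of 64 values, without its dimensions or row and column names. A minimal sketch for restoring the square shape in plain Clojure follows; `reshape-square` is a hypothetical helper. R stores matrices column by column, so each chunk is strictly a column, which makes no difference for a symmetric matrix like this one.

```clojure
(defn reshape-square
  "Hypothetical helper: splits a flat vector of n*n values into n vectors of n."
  [n flat]
  (mapv vec (partition n flat)))

;; 3 x 3 corner of the matrix (height, arm.span, forearm), values from the printout
(reshape-square 3 [1.0   0.846 0.805
                   0.846 1.0   0.881
                   0.805 0.881 1.0])
;; => [[1.0 0.846 0.805] [0.846 1.0 0.881] [0.805 0.881 1.0]]
```

For the full result, the same call would be `(reshape-square 8 (:cov m))`, where `m` stands for the converted map.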

4.8.2 Partially named list

(def pnl (r '[:!list :a 112 "abc" "cde" :b "qwe"]))
pnl
$a
[1] 112

[[2]]
[1] "abc"

[[3]]
[1] "cde"

$b
[1] "qwe"

(r->clj '(attributes ~pnl))
{:names ["a" "" "" "b"]}
(r->clj pnl)
{:a [112], 1 ["abc"], 2 ["cde"], :b ["qwe"]}
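
In the result above, named elements become keyword keys, while unnamed elements are keyed by their 0-based position in the list. The rule can be sketched in plain Clojure; `list-keys` is a hypothetical helper that only mimics the observed output and is not taken from clojisr.

```clojure
(defn list-keys
  "Hypothetical helper: keyword for a named element, 0-based position otherwise."
  [names]
  (map-indexed (fn [i nm] (if (seq nm) (keyword nm) i)) names))

;; names as reported by (attributes pnl), values as converted above
(def converted
  (zipmap (list-keys ["a" "" "" "b"])
          [[112] ["abc"] ["cde"] ["qwe"]]))

(:a converted)    ;; => [112]
(get converted 1) ;; => ["abc"]
(get converted 2) ;; => ["cde"]
(:b converted)    ;; => ["qwe"]
```

Unnamed elements therefore need `get` with an integer rather than keyword lookup.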

4.9 Dataset -> R

Every dataset is converted to a data.frame object. In the round trip below, the :$row.names column is restored as the row names of the data frame.

(clj->r (r->clj r.datasets/UScitiesD))
              Atlanta Chicago Denver Houston LosAngeles Miami NewYork
Atlanta             0     587   1212     701       1936   604     748
Chicago           587       0    920     940       1745  1188     713
Denver           1212     920      0     879        831  1726    1631
Houston           701     940    879       0       1374   968    1420
LosAngeles       1936    1745    831    1374          0  2339    2451
Miami             604    1188   1726     968       2339     0    1092
NewYork           748     713   1631    1420       2451  1092       0
SanFrancisco     2139    1858    949    1645        347  2594    2571
Seattle          2182    1737   1021    1891        959  2734    2408
Washington.DC     543     597   1494    1220       2300   923     205
              SanFrancisco Seattle Washington.DC
Atlanta               2139    2182           543
Chicago               1858    1737           597
Denver                 949    1021          1494
Houston               1645    1891          1220
LosAngeles             347     959          2300
Miami                 2594    2734           923
NewYork               2571    2408           205
SanFrancisco             0     678          2442
Seattle                678       0          2329
Washington.DC         2442    2329             0
source: notebooks/clojisr/v1/tutorials/dataset.clj