9 API Reference
Complete reference for every public function and constant in the zulipdata library:
scicloj.zulipdata.client: REST client for the Clojurians Zulip instance.
scicloj.zulipdata.pull: paginated, cached pulls of channel history.
scicloj.zulipdata.views: tablecloth projections of raw messages.
scicloj.zulipdata.anonymize: anonymized projections suitable for sharing.
scicloj.zulipdata.narrative: date helpers, channel lifecycles, newcomer tracking.
scicloj.zulipdata.graph: co-membership and co-presence graphs, community detection, rendering.
Each entry shows the docstring, a live example, and a test. The namespace links above lead to the conceptual walkthrough for each — read those for context; this chapter is the API reference.
Sample data
A small pull, reused across every example below. Each layer of the pipeline is bound for direct reuse: sample-pull (raw pull result), sample-messages (flat seq of raw messages), sample-timeline (plain tablecloth view), sample-anon (anonymized), sample-with-time (anonymized + date columns).
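;; Aliases used throughout this chapter. Namespace names come from the list
;; above; tc is assumed to be tablecloth.api.
(require '[scicloj.zulipdata.client :as client]
         '[scicloj.zulipdata.pull :as pull]
         '[scicloj.zulipdata.views :as views]
         '[scicloj.zulipdata.anonymize :as anon]
         '[scicloj.zulipdata.narrative :as nar]
         '[scicloj.zulipdata.graph :as graph]
         '[tablecloth.api :as tc])

(def sample-channels
  ["clojurecivitas" "scicloj-webpublic" "gratitude" "events"])

(def sample-pull
  (pull/pull-channels! sample-channels))

;; Keep only the per-channel entries (string keys); this skips :not-found.
(def sample-messages
  (->> sample-pull
       (filter (fn [[k _]] (string? k)))
       (mapcat (fn [[_ r]] (pull/all-messages r)))))

(def sample-timeline
  (views/messages-timeline sample-messages))

(def sample-anon
  (anon/anonymized-timeline sample-messages))

(def sample-with-time
  (nar/with-time-columns sample-anon))

scicloj.zulipdata.client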
base-url
API root for the Clojurians Zulip instance. All api-get paths are resolved relative to this prefix.
client/base-url
"https://clojurians.zulipchat.com/api/v1"
api-get
[path]
[path query-params]
Authenticated GET against the Clojurians Zulip API. path is resolved relative to base-url; query-params is an optional map. Wraps the request in a small retry loop with longer waits between retries and a 90-second per-request timeout. Returns the JSON body parsed with keyword keys.
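The retry behaviour described above can be pictured with a small sketch. This is not the library's implementation: `with-retries`, `:max-attempts`, and `:base-wait-ms` are hypothetical names, the wait is assumed to grow linearly with the attempt number, and the 90-second timeout is left out.

```clojure
;; Hypothetical helper, not part of zulipdata: retry a zero-argument
;; function, waiting longer after each failed attempt.
(defn with-retries
  [f {:keys [max-attempts base-wait-ms]
      :or   {max-attempts 3 base-wait-ms 1000}}]
  (loop [attempt 1]
    (let [result (try
                   {:ok (f)}
                   (catch Exception e
                     ;; Out of attempts: rethrow. Otherwise signal a retry.
                     (if (< attempt max-attempts)
                       {:retry e}
                       (throw e))))]
      (if (contains? result :ok)
        (:ok result)
        (do
          ;; Linear backoff: 1x, 2x, 3x ... the base wait.
          (Thread/sleep (long (* base-wait-ms attempt)))
          (recur (inc attempt)))))))
```

The `try` returns a marker map instead of recurring directly because Clojure does not allow `recur` from inside a `catch` clause.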
(-> (client/api-get "/server_settings")
    :realm_name)
"Clojurians"
With query parameters:
(-> (client/api-get "/messages"
                    {"narrow" (charred.api/write-json-str
                               [{:operator "channel" :operand "clojurecivitas"}])
                     "anchor" "newest"
                     "num_before" 1
                     "num_after" 0})
    :messages count)
1
whoami
[]
Calls /users/me and returns a short summary of the authenticated identity. Use this after configuring credentials to confirm everything works before running a pull.
(client/whoami)
{:email "user138175@clojurians.zulipchat.com",
 :full-name "Daniel Slutsky",
 :user-id 138175,
 :is-bot false,
 :is-admin true,
 :role 100}
get-me
[]
Full /users/me response for the authenticated account. Use whoami for a trimmed summary.
(-> (client/get-me) :user_id integer?)
true
get-streams
[]
Full /streams response — every stream the authenticated user can see. Returns the raw Zulip API map; the stream entries live under :streams.
(-> (client/get-streams) :streams count pos?)
true
get-messages
[{:keys [narrow anchor num-before num-after apply-markdown], :or {anchor "newest", num-before 100, num-after 0, apply-markdown false}}]
Fetch messages matching a narrow. narrow is a vector of maps, e.g. [{:operator "channel" :operand "data-science"}]. anchor may be "newest", "oldest", "first_unread", or a message id. Returns up to num-before + num-after + 1 messages around the anchor.
(-> (client/get-messages
     {:narrow [{:operator "channel" :operand "clojurecivitas"}]
      :anchor "newest"
      :num-before 3
      :num-after 0})
    :messages count)
3
scicloj.zulipdata.pull
default-batch-size
Messages requested per window when pull-channel! is called without an explicit :batch-size. 5000 is also Zulip’s per-request cap.
pull/default-batch-size
5000
fetch-window
[stream-name anchor-id batch-size]
Cached forward window. Returns the deref’d page map.
(-> (pull/fetch-window "clojurecivitas" 0 100)
    :messages count)
100
pull-channel!
[stream-name start-anchor-id & {:keys [batch-size refresh], :or {batch-size default-batch-size}}]
Walk forward through stream-name in cached windows, starting at start-anchor-id. Returns {:pages [...], :message-count n}.
Options:
:batch-size - messages per window (default 5000).
:refresh - when true, any cached page with found_newest: true is invalidated and re-fetched once, then the walk continues if new full windows appeared. Use to catch up after messages were posted since the last pull.
With :refresh false (default), repeated calls are served entirely from cache.
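The forward walk can be sketched without touching the network. In this sketch `walk-forward` and `fake-fetch` are hypothetical names, caching and `:refresh` are left out, and advancing the anchor to one past the last id seen is an assumption about how non-overlapping windows could be produced, not a statement about the library's internals.

```clojure
;; Hypothetical sketch of a forward walk over fixed-size windows.
;; fetch-page is any fn of [anchor-id batch-size] returning
;; {:messages [...] :found_newest boolean}.
(defn walk-forward
  [fetch-page start-anchor-id batch-size]
  (loop [anchor start-anchor-id
         pages  []]
    (let [page  (fetch-page anchor batch-size)
          pages (conj pages page)
          msgs  (:messages page)]
      (if (or (:found_newest page) (empty? msgs))
        {:pages         pages
         :message-count (reduce + (map (comp count :messages) pages))}
        ;; Assumed rule: the next window starts just past the last id seen.
        (recur (inc (:id (last msgs))) pages)))))

;; A stand-in for the API: a channel holding message ids 1..7.
(def fake-channel (mapv (fn [id] {:id id}) (range 1 8)))

(defn fake-fetch
  [anchor batch-size]
  (let [remaining (filterv #(>= (:id %) anchor) fake-channel)]
    {:messages     (vec (take batch-size remaining))
     :found_newest (<= (count remaining) batch-size)}))

(walk-forward fake-fetch 0 3)
;; three pages (ids 1-3, 4-6, 7) and :message-count 7
```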
A complete walk from id zero to the channel’s tip. Result keys:
(-> (pull/pull-channel! "clojurecivitas" 0)
    (select-keys [:pages :message-count])
    keys
    set)
#{:pages :message-count}
all-messages
[pull-result]
Flatten the :pages result of pull-channel! into a single sequence of messages, de-duplicating by :id (windows are non-overlapping by construction; this is a redundant safety check).
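As an illustration of the flatten-and-de-duplicate step, here is a minimal sketch on plain maps. `flatten-pages` is a hypothetical name, and keeping the first occurrence of each `:id` is an assumption; the docstring does not say which duplicate wins.

```clojure
;; Hypothetical sketch: concatenate the :messages of every page and keep
;; only the first message seen for each :id, preserving order.
(defn flatten-pages
  [pull-result]
  (->> (:pages pull-result)
       (mapcat :messages)
       (reduce (fn [{:keys [seen] :as acc} m]
                 (if (contains? seen (:id m))
                   acc
                   (-> acc
                       (update :seen conj (:id m))
                       (update :out conj m))))
               {:seen #{} :out []})
       :out))

(flatten-pages {:pages [{:messages [{:id 1} {:id 2}]}
                        {:messages [{:id 2} {:id 3}]}]})
;; => [{:id 1} {:id 2} {:id 3}]
```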
(let [walk (pull/pull-channel! "clojurecivitas" 0)
      messages (pull/all-messages walk)]
  (= (count messages) (:message-count walk)))
true
pull-channels!
[channel-names & {:keys [batch-size refresh parallelism], :as opts, :or {parallelism default-parallelism}}]
Pull a collection of channels by name. Returns a map {channel-name {:pages ... :message-count ... :stream-id ... :first-message-id ...}}.
First-message ids are resolved from /streams. Any unknown channel names are returned under key :not-found as a vector.
Options:
:batch-size - passed through to pull-channel! (default 5000).
:refresh - passed through to pull-channel!.
:parallelism - number of channels to pull concurrently (default default-parallelism, currently 8). Pass 1 for fully sequential pulls.
Successful entries are keyed by name; unknown names land in :not-found.
(-> (pull/pull-channels! ["clojurecivitas" "no-such-channel"])
    :not-found)
["no-such-channel"]
public-channel-names
[]
Names of all channels visible to the bot that are either public or web-public.
(-> (pull/public-channel-names) count pos?)
true
pull-public-channels!
[& opts]
Convenience: pull every public + web-public channel visible to the bot. Same options as pull-channels!.
Not run here — a fresh full-corpus pull can take minutes. Pulls every name returned by pull/public-channel-names and accepts the same options as pull-channels!.
scicloj.zulipdata.views
messages-timeline
[messages]
One row per message — simple-valued fields only. Good for activity-over-time and sender/topic analyses.
(-> (views/messages-timeline sample-messages)
    tc/row-count)
1096
The columns:
(-> sample-timeline tc/column-names sort)
(:channel
:client
:content
:content-length
:edited
:id
:instant
:last-edit-ts
:sender
:sender-id
:stream-id
:subject
:timestamp)
reactions-long
[messages]
One row per (message, reaction). Fields: message-id, stream-id, channel, subject, emoji-name, emoji-code, reaction-type, user-id, message-ts.
(-> (views/reactions-long sample-messages)
    tc/column-names sort)
(:channel
:emoji-code
:emoji-name
:message-id
:message-ts
:reaction-type
:stream-id
:subject
:user-id)
edits-long
[messages]
One row per edit event in :edit_history. Note: some edits record only topic/stream moves (no :prev_content); we include prev-content as-is.
(-> (views/edits-long sample-messages)
    tc/column-names sort)
(:channel
:edit-ts
:edit-user-id
:message-id
:prev-content
:prev-stream
:prev-subject
:stream-id)
topic-links-long
[messages]
One row per auto-linked URL inside a message.
(-> (views/topic-links-long sample-messages)
    tc/column-names sort)
(:channel :link-text :link-url :message-id :stream-id)
scicloj.zulipdata.anonymize
user-key
[sender-id]
Stable, irreversible 16-hex-char identifier for a sender id.
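One way to obtain a stable, hard-to-reverse 16-hex-char key is to hash the id together with a secret salt and keep the first 8 bytes of the digest. The sketch below assumes SHA-256 and a caller-supplied salt; `sketch-key` is a hypothetical name, and the actual hash and salting used by user-key are not documented here.

```clojure
(import 'java.security.MessageDigest)

;; Hypothetical sketch, not the library's implementation: hash "salt:x"
;; with SHA-256 and render the first 8 bytes as 16 lowercase hex chars.
;; nil passes through unchanged.
(defn sketch-key
  [salt x]
  (when (some? x)
    (let [input  (.getBytes ^String (str salt ":" x) "UTF-8")
          digest (.digest (MessageDigest/getInstance "SHA-256") input)]
      (->> digest
           (take 8)
           ;; Bytes are signed on the JVM; mask to 0..255 before formatting.
           (map #(format "%02x" (bit-and % 0xff)))
           (apply str)))))

(= (sketch-key "my-secret-salt" 42)
   (sketch-key "my-secret-salt" 42))
;; => true
```

Without a secret salt, small integer ids could be recovered by hashing every candidate id, so the salt carries most of the irreversibility in a scheme like this.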
(anon/user-key 42)
"62b81b15a6414d9b"
Stable across calls; nil passes through:
[(= (anon/user-key 42) (anon/user-key 42))
 (anon/user-key nil)]
[true nil]
subject-key
[subject]
Stable 16-hex-char identifier for a topic/subject string. Wide enough that two distinct subjects almost never collide, so the key uniquely identifies a topic given the full corpus.
(anon/subject-key "channel introductions")
"b61cd3d678d6f0da"
anonymized-timeline
[messages]
One row per message, anonymized. Sender identity and subject are replaced by stable hash keys; message text is replaced by length only. Reaction count is kept; the per-emoji breakdown lives in anonymized-reactions.
(-> (anon/anonymized-timeline sample-messages)
    tc/column-names sort)
(:channel
:client
:content-length
:edited
:id
:last-edit-ts
:reaction-count
:stream-id
:subject-key
:timestamp
:user-key)
anonymized-reactions
[messages]
One row per (message, reaction). Both the message's subject and the reactor's identity are anonymized; the emoji name is preserved (it captures community sentiment, not message content).
(-> (anon/anonymized-reactions sample-messages)
    tc/column-names sort)
(:channel
:emoji-code
:emoji-name
:message-id
:message-ts
:reaction-type
:reactor-user-key
:stream-id
:subject-key)
anonymized-edits
[messages]
One row per edit event. Editor and prior subject are anonymized; prior content is dropped. prev-stream is left as-is — it is a stream id, not personal data.
(-> (anon/anonymized-edits sample-messages)
    tc/column-names sort)
(:channel
:edit-ts
:editor-user-key
:message-id
:prev-stream
:prev-subject-key
:stream-id)
scicloj.zulipdata.narrative
ts->month-date
[ts]
Epoch-second -> first-of-month LocalDate (UTC).
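The conversion can be reproduced with java.time alone. This is a sketch of the idea under a hypothetical name, `epoch->month-date`, and may differ from how the library does it.

```clojure
(import '(java.time Instant ZoneOffset))

;; Hypothetical sketch: epoch seconds -> LocalDate of the first day of
;; that month, interpreted in UTC.
(defn epoch->month-date
  [ts]
  (-> (Instant/ofEpochSecond ts)
      (.atZone ZoneOffset/UTC)
      (.toLocalDate)
      (.withDayOfMonth 1)))

(str (epoch->month-date 1725611765))
;; => "2024-09-01"
```

Fixing the zone to UTC keeps month boundaries the same on every machine, whatever the local time zone.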
(nar/ts->month-date 1725611765)
#object[java.time.LocalDate 0xe211652 "2024-09-01"]
ts->year-month
[ts]
Epoch-second -> “YYYY-MM” string (UTC).
(nar/ts->year-month 1725611765)
"2024-09"
ts->year
[ts]
Epoch-second -> integer year (UTC).
(nar/ts->year 1725611765)
2024
with-time-columns
[timeline]
Add :month-date, :year-month, and :year columns to a timeline that has a :timestamp column (epoch seconds).
(-> (nar/with-time-columns sample-anon)
    tc/column-names
    set
    (clojure.set/intersection #{:month-date :year-month :year}))
#{:month-date :year :year-month}
channel-lifecycle
[timeline]
One row per channel: first-date, last-date, total messages, active months, distinct users. Sorted ascending by first-date by default.
(-> (nar/channel-lifecycle sample-with-time)
    tc/column-names sort)
(:active-months :channel :distinct-users :first-date :last-date :total)
channels-by-name-pattern
[timeline regex]
Channels whose name matches regex.
(nar/channels-by-name-pattern sample-with-time #"civitas|gratitude")
["clojurecivitas" "gratitude"]
first-posters-of-channel
[timeline channel n]
First n distinct user-keys to post in channel, with their first-post date. Useful for identifying a channel’s earliest contributors.
(-> (nar/first-posters-of-channel sample-with-time "clojurecivitas" 5)
    tc/column-names sort)
(:first-post-date :user-key)
prior-channels-of-newcomers
[timeline channel year-month]
For users whose first-ever post in channel falls in the given year-month (“YYYY-MM”), tally the channels they had posted in before that first post. Returns one row per (prior-channel) with counts of how many newcomers passed through it.
(-> (nar/prior-channels-of-newcomers sample-with-time "clojurecivitas" "2025-10")
    tc/column-names sort)
(:newcomers-touched :prior-channel)
channel-monthly-activity
[timeline]
[timeline channels]
Long-form: one row per (channel, month-date) with :msgs count. Restricted to channels if supplied, else all channels.
(-> (nar/channel-monthly-activity sample-with-time #{"clojurecivitas"})
    tc/column-names sort)
(:channel :month-date :msgs)
scicloj.zulipdata.graph
user-channel-sets
[timeline]
[timeline min-channels]
Map of user-key → set of channels they posted in. Drops users with fewer than min-channels channels (default 1).
Each value in the returned map is a set of channel names:
(let [u->c (graph/user-channel-sets sample-with-time)
      [_ chans] (first u->c)]
  (set? chans))
true
channel-comembership-graph
[timeline & {:keys [min-shared], :or {min-shared 1}}]
Undirected weighted graph: nodes are channels, edges weighted by shared user count. min-shared filters out edges with fewer than N shared users.
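The edge weights can be illustrated on plain data. The sketch below works on a seq of row maps with `:channel` and `:user-key` and returns a map from channel pair to shared-user count rather than a graph object; `channel-users` and `comembership-edges` are hypothetical names, not library functions.

```clojure
(require 'clojure.set)

;; Hypothetical sketch: channel -> set of user-keys that posted there.
(defn channel-users
  [rows]
  (reduce (fn [m {:keys [channel user-key]}]
            (update m channel (fnil conj #{}) user-key))
          {}
          rows))

;; Hypothetical sketch: map of #{channel-a channel-b} -> shared user
;; count, keeping only pairs with at least min-shared users in common.
(defn comembership-edges
  [rows min-shared]
  (let [c->u  (channel-users rows)
        chans (vec (sort (keys c->u)))]
    (into {}
          (for [i (range (count chans))
                j (range (inc i) (count chans))
                :let [a (chans i)
                      b (chans j)
                      w (count (clojure.set/intersection (c->u a) (c->u b)))]
                :when (>= w min-shared)]
            [#{a b} w]))))

(comembership-edges
 [{:channel "a" :user-key "u1"} {:channel "a" :user-key "u2"}
  {:channel "b" :user-key "u1"} {:channel "b" :user-key "u2"}
  {:channel "c" :user-key "u2"} {:channel "d" :user-key "u9"}]
 1)
;; => {#{"a" "b"} 2, #{"a" "c"} 1, #{"b" "c"} 1}
```

Channel "d" shares no users with the others, so it contributes no edges here; whether such a channel still appears as an isolated node is up to the graph representation.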
(let [g (graph/channel-comembership-graph sample-with-time :min-shared 1)]
  (= (set sample-channels) (.vertexSet g)))
true
user-copresence-graph
[timeline & {:keys [min-shared min-channels], :or {min-shared 3, min-channels 3}}]
Undirected weighted graph: nodes are users, edges weighted by shared channel count. min-shared filters edges; min-channels filters users (active in ≥ N channels).
(let [g (graph/user-copresence-graph sample-with-time
                                     :min-shared 2 :min-channels 2)]
  (pos? (count (.vertexSet g))))
true
migration-graph
[timeline from-set & {:keys [min-users], :or {min-users 3}}]
Directed weighted graph: edge from from-channel to to-channel weighted by the number of users who posted in from-channel and later (after their last post in any from-set channel) posted in to-channel. Excludes self-loops and edges within from-set.
Only users with at least 5 posts in from-set are considered.
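One reading of the migration rule can be written out on plain row maps with `:user-key`, `:channel`, and `:timestamp`. This is an interpretation of the docstring, not the library's code: `migration-edges` is a hypothetical name, `:min-posts` is a hypothetical option standing in for the fixed threshold of 5, and the result is a map from [from-channel to-channel] to user count rather than a graph object.

```clojure
;; Hypothetical sketch: for each user with enough posts in from-set, find
;; the channels outside from-set they posted in after their last from-set
;; post, then count users per [from-channel to-channel] pair.
(defn migration-edges
  [rows from-set {:keys [min-users min-posts]
                  :or   {min-users 3 min-posts 5}}]
  (->> (group-by :user-key rows)
       (mapcat
        (fn [[_ posts]]
          (let [from-posts (filter #(contains? from-set (:channel %)) posts)]
            (when (and (seq from-posts)
                       (>= (count from-posts) min-posts))
              (let [last-from (apply max (map :timestamp from-posts))
                    sources   (set (map :channel from-posts))
                    targets   (->> posts
                                   (remove #(contains? from-set (:channel %)))
                                   (filter #(> (:timestamp %) last-from))
                                   (map :channel)
                                   set)]
                ;; One [source target] pair per user, so each user
                ;; counts at most once per edge.
                (for [s sources, t targets] [s t]))))))
       frequencies
       (filter (fn [[_ n]] (>= n min-users)))
       (into {})))
```

Because targets are drawn only from channels outside from-set, this sketch produces neither self-loops nor edges within from-set.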
Edges from each from-set source to channels users moved to next. With clojurecivitas as the seed, no self-loops:
(let [g (graph/migration-graph sample-with-time #{"clojurecivitas"} :min-users 1)]
  (every? (fn [e] (not= (.getEdgeSource g e) (.getEdgeTarget g e)))
          (.edgeSet g)))
true
betweenness
[g]
Map node → betweenness centrality score.
(let [g (graph/channel-comembership-graph sample-with-time)
      scores (graph/betweenness g)]
  (= (.vertexSet g) (set (keys scores))))
true
girvan-newman
[g k]
Vector of node-sets, one per cluster. k is the desired number of clusters.
(let [g (graph/channel-comembership-graph sample-with-time)
      clusters (graph/girvan-newman g 2)]
  (count clusters))
2
label-propagation
[g]
Vector of node-sets — communities found by label propagation (number of clusters chosen by the algorithm).
(let [g (graph/channel-comembership-graph sample-with-time)
      clusters (graph/label-propagation g)]
  (every? set? clusters))
true
->cytoscape-elements
[g & {:keys [node-attrs edge-attrs], :or {node-attrs (constantly {}), edge-attrs (constantly {})}}]
Convert a JGraphT graph to a :elements map for kind/cytoscape. node-attrs and edge-attrs are optional fns of the node / [u v weight] returning a map of extra attributes (merged into :data).
(let [g (graph/channel-comembership-graph sample-with-time)
      e (graph/->cytoscape-elements g)]
  (set (keys e)))
#{:nodes :edges}
->dot
[g & {:keys [directed node-label edge-label name], :or {directed true, node-label str, edge-label (constantly nil), name "G"}}]
Render a JGraphT graph as Graphviz DOT source. directed chooses between digraph/graph. node-label and edge-label are optional fns producing label strings.
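For orientation, the shape of the DOT text can be sketched from a plain edge map. `edges->dot` and its options are hypothetical and independent of the library; unlike ->dot, whose default edge-label yields nil, this sketch always labels edges with their weight.

```clojure
(require '[clojure.string :as str])

;; Hypothetical sketch: render a map of [u v] -> weight as Graphviz DOT.
;; "digraph" with "->" for directed output, "graph" with "--" otherwise.
(defn edges->dot
  [edges {:keys [directed graph-name]
          :or   {directed true graph-name "G"}}]
  (let [kind  (if directed "digraph" "graph")
        arrow (if directed "->" "--")]
    (str kind " " graph-name " {\n"
         (str/join
          (for [[[u v] w] edges]
            (str "  \"" u "\" " arrow " \"" v "\" [label=\"" w "\"];\n")))
         "}\n")))

(print (edges->dot {["gratitude" "events"] 2} {:directed false}))
;; graph G {
;;   "gratitude" -- "events" [label="2"];
;; }
```

Node names are quoted so that channel names containing hyphens or spaces remain valid DOT identifiers; names containing a double quote would need escaping, which this sketch omits.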
(let [g (graph/channel-comembership-graph sample-with-time)
      dot (graph/->dot g :directed false)]
  (and (string? dot)
       (clojure.string/starts-with? dot "graph ")))
true