buran

About buran

RSS/Atom Feed Parsing and Generating for Clojure. Bidirectional. Data-driven.

Buran (meaning "Snowstorm" or "Blizzard") was the first spaceplane to be produced as part of the Soviet/Russian Buran programme. Wikipedia

Buran 🌀

Parse and generate RSS/Atom feeds in Clojure

Buran is a bidirectional feed library: parse any RSS/Atom feed into Clojure data structures, transform them with standard functions, and produce feeds in any format. Built on ROME Tools with a data-driven approach.

Buran can act as an aggregator for feeds in different formats by converting them all into regular Clojure data structures. Consuming a feed yields a map that can be read or manipulated with ordinary functions such as filter, sort, assoc, and dissoc. After modifying the map, you can have Buran generate your own feed from it, optionally in a different format (RSS 2.0, 1.0, 0.9x or Atom 1.0, 0.3).
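To make the "regular functions" point concrete, here is a minimal sketch that works on a hand-written map shaped like the ones shown in this README. It makes no Buran calls; the sample data and the helper name entries-by-author are made up for illustration.

```clojure
;; Plain Clojure data shaped like a consumed feed; no Buran calls involved.
(def sample-feed
  {:info    {:feed-type "atom_1.0" :title "Active questions"}
   :entries [{:title "Older"  :author "alice" :published-date #inst "2018-08-19T10:00:00.000-00:00"}
             {:title "Newest" :author "bob"   :published-date #inst "2018-08-20T05:54:39.000-00:00"}
             {:title "Middle" :author "alice" :published-date #inst "2018-08-20T01:00:00.000-00:00"}]})

(defn entries-by-author
  "Hypothetical helper: keeps only entries written by `author`,
   orders them newest first, and retitles the feed."
  [feed author]
  (-> feed
      (update :entries (fn [entries]
                         (->> entries
                              (filter #(= author (:author %)))
                              ;; Reverse the comparator arguments for descending order.
                              (sort-by :published-date #(compare %2 %1))
                              vec)))
      (assoc-in [:info :title] (str "Entries by " author))))

(entries-by-author sample-feed "alice")
```

The result is still an ordinary feed map, so it could be handed to produce like any other.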

Quick Start

;; Add to deps.edn
{:deps {buran/buran {:mvn/version "0.1.4"}}}

;; Or to project.clj
[buran "0.1.4"]

;; In your namespace
(ns your.app
  (:require [buran.core :as buran]))

;; Parse a feed
(def data (buran/consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure"))

;; Generate a feed
(buran/produce {:info {:feed-type "atom_1.0" :title "My Feed"}
                :entries [{:title "Hello" :description {:value "World"}}]})

Usage

Regardless of the feed format, and whether you are consuming a feed or producing a new one, Buran uses the same data structure every time. Buran's API is concise: consume, consume-http, and produce, plus helpers for manipulating feeds, including combine-feeds, filter-entries, sort-entries-by, and shrink. The basic workflow is to pass that data structure from one API function to the next. See the documentation for the available options and further details.
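Because every feed is the same kind of map, combining feeds comes down to ordinary data manipulation. The sketch below is a hand-written illustration of that idea, not Buran's combine-feeds (whose exact behaviour is covered in the library documentation); distinct-by and merge-feeds are hypothetical names.

```clojure
(defn distinct-by
  "Returns the items of `coll` in order, keeping the first item
   seen for each value of (f item)."
  [f coll]
  (first
   (reduce (fn [[kept seen] item]
             (let [k (f item)]
               (if (contains? seen k)
                 [kept seen]
                 [(conj kept item) (conj seen k)])))
           [[] #{}]
           coll)))

(defn merge-feeds
  "Hypothetical helper: keeps the first feed's :info and concatenates all
   entries, dropping repeats. Entries are compared by :uri when present,
   otherwise by full equality."
  [& feeds]
  {:info    (:info (first feeds))
   :entries (distinct-by (fn [entry] (or (:uri entry) entry))
                         (mapcat :entries feeds))})

(merge-feeds
 {:info {:feed-type "atom_1.0" :title "A"}
  :entries [{:title "One" :uri "u1"} {:title "Two" :uri "u2"}]}
 {:info {:feed-type "rss_2.0" :title "B"}
  :entries [{:title "Two again" :uri "u2"} {:title "Three" :uri "u3"}]})
```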

Examples

Consume a feed from String

(def feed "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
           <feed xmlns=\"http://www.w3.org/2005/Atom\">
             <title>Feed title</title>
             <subtitle />
             <entry>
               <title>Entry title</title>
               <author>
                 <name />
               </author>
               <summary>entry description</summary>
             </entry>
           </feed>
           ")
(shrink (consume feed))
=>
{:info    {:feed-type "atom_1.0", 
           :title     "Feed title"},
 :entries [{:title       "Entry title", 
            :description {:value "entry description"}}]}

Produce a feed

(def feed {:info {:feed-type "atom_1.0"
                  :title     "Feed title"}
           :entries [{:title       "Entry title"
                      :description {:value "entry description"}}]})
(produce feed)
=>
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
 <feed xmlns=\"http://www.w3.org/2005/Atom\">\r
   <title>Feed title</title>\r
   <subtitle />\r
   <entry>\r
     <title>Entry title</title>\r
     <author>\r
       <name />\r
     </author>\r
     <summary>entry description</summary>\r
   </entry>\r
 </feed>
 "

Consume a feed over http

(consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure")
=>
{:info {...},
 :entries [...],
 :foreign-markup [...]}

Shrink a feed (remove nils, empty collections, empty maps, etc.)

(shrink (consume-http "https://stackoverflow.com/feeds/tag?tagnames=clojure"))
=>
{:info {:description "most recent 30 from stackoverflow.com",
        :feed-type "atom_1.0",
        :published-date #inst"2018-08-20T08:03:33.000-00:00",
        :title "Active questions tagged clojure - Stack Overflow",
        :link "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
        :uri "https://stackoverflow.com/feeds/tag?tagnames=clojure",
        :links [{:href "https://stackoverflow.com/questions/tagged/?tagnames=clojure&sort=active",
                 :type "text/html",
                 :rel "alternate",
                 :length 0}, ...]},
 :entries [{:description {:type "html", :value "<p>..."},
            :updated-date #inst"2018-08-20T06:16:12.000-00:00",
            :foreign-markup [...],
            :published-date #inst"2018-08-20T05:54:39.000-00:00",
            :title "Clojure evaluate lazy sequence",
            :author "Constantine",
            :categories [{:name "clojure", :taxonomy-uri "https://stackoverflow.com/tags"}, ...],
            :link "https://stackoverflow.com/questions/51924808/clojure-evaluate-lazy-sequence",
            :uri "https://stackoverflow.com/q/51924808",
            :authors [{:name "Constantine", :uri "https://stackoverflow.com/users/4201205"}],
            :links [{:href "https://stackoverflow.com/questions/51924808/clojure-evaluate-lazy-sequence",
                     :rel "alternate",
                     :length 0}]}, ...],
 :foreign-markup [...]}

Supported Formats

| Format | Parse | Generate | Notes |
| --- | --- | --- | --- |
| Atom 1.0 | ✅ | ✅ | Full support |
| Atom 0.3 | ✅ | ✅ | Legacy |
| RSS 2.0 | ✅ | ✅ | Most common |
| RSS 1.0 | ✅ | ✅ | RDF-based |
| RSS 0.9x | ✅ | ✅ | 0.91 to 0.94 variants |
| RSS 0.9 | ✅ | ✅ | Original |

Basic API Reference

consume

Parse a feed from a string, file, reader, or other source.

;; Shortcut
(consume "<?xml version=\"1.0\"?><feed>...</feed>")

;; With options
(consume {:from             (java.io.File. (System/getProperty "user.home") "feed.xml")
                                        ; Java does not expand "~", so the path is built from user.home
                                        ; String, File, Reader, W3C DOM document, JDOM document, W3C SAX InputSource
          :validate         false       ; Indicates if the input should be validated
          :locale           java.util.Locale/US ; java.util.Locale
          :xml-healer-on    true        ; Healing trims leading chars from the stream (empty spaces and comments) until the XML prolog.
                                        ; Healing resolves HTML entities (from literal to code number) in the reader.
                                        ; The healing is done only with the File and Reader.
          :allow-doctypes   false       ; Enable only when the feeds you process are fully trusted
          :throw-exception  false       ; false - return a map with the exception; true - throw the exception
         })

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| :from | String, File, Reader, InputStream, W3C DOM, JDOM, SAX InputSource | required | Source to parse |
| :validate | boolean | false | Validate XML against DTD/schema |
| :locale | java.util.Locale | (Locale/US) | Locale for parsing |
| :xml-healer-on | boolean | true | Trim whitespace/comments before XML prolog; resolve HTML entities |
| :allow-doctypes | boolean | false | Allow DOCTYPE declarations (⚠️ security risk - only for trusted sources) |
| :throw-exception | boolean | false | If false, return error map; if true, throw exception |

consume-http

Fetch and parse a feed over HTTP.

;; Shortcut
(consume-http "https://example.com/feed.xml")

;; With options
(consume-http {:from             "https://stackoverflow.com/feeds/tag?tagnames=clojure"
                                                      ; <http url string>, URL, File, InputStream
               :headers          {"X-Header" "Value"} ; Request's HTTP headers map
               :lenient          true                 ; Indicates if the charset encoding detection should be relaxed
               :default-encoding "US-ASCII"           ; Supports: UTF-8, UTF-16, UTF-16BE, UTF-16LE, CP1047, US-ASCII
               ;; ... plus all options accepted by a (consume) call
              })

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| :from | URL string, java.net.URL, File, InputStream | required | URL or source to fetch |
| :headers | map | {} | HTTP headers (e.g., {"User-Agent" "MyApp"}) |
| :lenient | boolean | true | Relaxed charset encoding detection |
| :default-encoding | String | "US-ASCII" | Fallback encoding: UTF-8, UTF-16, UTF-16BE, UTF-16LE, CP1047, US-ASCII |
| :content-type | String | nil | Override Content-Type header (used with InputStream) |

Beware! When given an HTTP URL string or a URL, consume-http uses a rudimentary fetcher that works only in the simplest cases. For instance, it does not follow HTTP 302 redirects. Consider fetching the feed with a dedicated library such as clj-http or http-kit.
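One way to fetch a feed yourself without adding a dependency is the JDK's own java.net.http client (JDK 11+), used here in place of clj-http or http-kit. This is a minimal sketch, not part of Buran: build-client, build-request, and fetch-feed-body are hypothetical names, and error handling, timeouts, and status-code checks are left out.

```clojure
(import '(java.net URI)
        '(java.net.http HttpClient HttpClient$Redirect HttpRequest
                        HttpRequest$Builder HttpResponse$BodyHandlers))

(defn build-client
  "Builds a JDK HTTP client that follows redirects
   (all of them except HTTPS -> HTTP downgrades)."
  ^HttpClient []
  (-> (HttpClient/newBuilder)
      (.followRedirects HttpClient$Redirect/NORMAL)
      (.build)))

(defn build-request
  "Builds a GET request for `url`, adding each entry of the `headers` map."
  ^HttpRequest [^String url headers]
  (let [^HttpRequest$Builder builder (HttpRequest/newBuilder (URI/create url))]
    (doseq [[k v] headers]
      (.header builder ^String k ^String v))
    (.build builder)))

(defn fetch-feed-body
  "Fetches `url` and returns the response body as a String.
   Performs network I/O when called."
  [url headers]
  (.body (.send (build-client)
                (build-request url headers)
                (HttpResponse$BodyHandlers/ofString))))
```

The returned string can then be passed to consume, which accepts a String source, for example (consume (fetch-feed-body "https://example.com/feed.xml" {"User-Agent" "MyApp"})). The body is decoded using the charset from the response's Content-Type header, falling back to UTF-8.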

produce

Generate an RSS/Atom feed as a string, file, or DOM.

(produce {:feed            {:info {:feed-type "atom_1.0" ; Supports: atom_1.0, atom_0.3, rss_2.0, 
                                                         ; rss_1.0, rss_0.94, rss_0.93, rss_0.92, 
                                                         ; rss_0.91U (Userland), rss_0.91N (Netscape), 
                                                         ; rss_0.9
                                   :title "Feed title"}
                            :entries [{:title       "Entry 1 title"
                                       :description {:value "entry description"}}]
                            :foreign-markup nil}

          :to              :string ; <file path string>, :string, :w3cdom, :jdom, File, Writer
          :pretty-print    true    ; Pretty-print XML output
          :throw-exception false   ; false - return map with an exception, throw an exception otherwise
         })

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| :feed | map | nil (uses argument as feed) | Feed data structure to generate |
| :to | :string, :w3cdom, :jdom, String (file path), File, Writer | :string | Output destination |
| :pretty-print | boolean | true | Pretty-print XML output |
| :throw-exception | boolean | false | If false, return error map; if true, throw exception |
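Since the output format is selected by :feed-type inside :info, switching formats is a one-key change to the feed map. The sketch below shows that step on plain data. The set of type strings is copied from the produce example above, and with-feed-type is a hypothetical helper, not a Buran function.

```clojure
;; Feed-type strings listed in the produce example above.
(def supported-feed-types
  #{"atom_1.0" "atom_0.3" "rss_2.0" "rss_1.0" "rss_0.94" "rss_0.93"
    "rss_0.92" "rss_0.91U" "rss_0.91N" "rss_0.9"})

(defn with-feed-type
  "Hypothetical helper: returns `feed` with its :feed-type replaced,
   rejecting strings that are not in `supported-feed-types`."
  [feed feed-type]
  (when-not (contains? supported-feed-types feed-type)
    (throw (ex-info "Unsupported feed type" {:feed-type feed-type})))
  (assoc-in feed [:info :feed-type] feed-type))

(with-feed-type {:info {:feed-type "atom_1.0" :title "Feed title"}
                 :entries [{:title "Entry title"}]}
                "rss_2.0")
```

The target format may require fields that the source feed did not carry, so a converted map can still need extra keys in :info before produce accepts it.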

shrink

Remove nil values and empty collections from feed data.

(shrink feed)
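To illustrate the kind of clean-up shrink performs, here is a rough re-implementation of the idea with clojure.walk. It is an assumption-laden sketch, not Buran's code: shrink-sketch and blank-value? are hypothetical names, and whether the real shrink also drops things like empty strings is not stated here, so this version keeps them.

```clojure
(require '[clojure.walk :as walk])

(defn- blank-value?
  "True for nil and for empty collections (maps, vectors, sets, lists)."
  [v]
  (or (nil? v)
      (and (coll? v) (empty? v))))

(defn shrink-sketch
  "Recursively removes nil values and empty collections from nested data.
   postwalk visits children first, so a map that becomes empty after
   cleaning is removed from its parent as well."
  [data]
  (walk/postwalk
   (fn [node]
     (cond
       ;; Map entries are vectors too; leave them for the enclosing map to judge.
       (map-entry? node) node
       (map? node)       (into {} (remove (fn [[_ v]] (blank-value? v))) node)
       (vector? node)    (into [] (remove blank-value?) node)
       :else             node))
   data))

(shrink-sketch {:info           {:title "Feed title" :subtitle nil :links []}
                :entries        [{:title "Entry title" :author {:name nil} :length 0} nil]
                :foreign-markup []})
```

Values such as 0 and false are kept, which matches the :length 0 entries visible in the shrunk Stack Overflow output above.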

License

Copyright © 2018-2026 Aleksei Sotnikov

Distributed under the MIT License