Skip to content

Zorbage: algebraic data types and algorithms for use in numeric processing.

License

Notifications You must be signed in to change notification settings

bdezonia/zorbage

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Zorbage: algebraic data types and algorithms for use in numeric processing

Developer Info:

How to include zorbage in your Maven project

  Add the following dependency to your project's pom.xml:
  
  <dependency>
    <groupId>io.github.bdezonia</groupId>
    <artifactId>zorbage</artifactId>
    <version>2.0.3.1</version>
  </dependency>
  
How to include zorbage in a different build system

  See https://search.maven.org/artifact/io.github.bdezonia/zorbage/2.0.3.1/jar
  for instructions on how to reference zorbage in build systems such as
  Gradle or others.

Project Goals:
  - provide a framework for reusable numeric algorithms
  - support numeric computing in Java in an efficient manner
    - provide code easier to develop and almost as fast as C++
    - provide code that is faster and less error prone than Python/R/Matlab
  - support very large data sets in an efficient manner
  - break limitations of many programming languages in terms of the 
      computable types provided and the extensibility of such types
  - do all this with a powerful set of simple abstractions

Contains support for:
  - integers, rationals, reals, complex numbers, quaternions, octonions
  - numbers, vectors, matrices, and tensors
  - various precisions: 1-bit to 128-bit to unbounded (signed and unsigned and float)
  - very large datasets (arrays, virtual files, sparse structures, JDBC storage)
  - generic programming
  - algebraic/group-theoretic algorithms
  - procedural java with object oriented comforts
  - the definition of your own types while reusing existing algorithms

Can you show me what Zorbage can do?

  For some overviews see:

    https://github.com/bdezonia/zorbage/tree/master/src/main/java/example

  Once you've covered that if you have more questions look at Zorbage's
    extensive test code at:

    https://github.com/bdezonia/zorbage/tree/master/src/test/java/nom/bdezonia/zorbage

  To see a few small stand alone programs have a look here:

    https://github.com/bdezonia/zorbage/tree/master/example

Thanks, I'll look that up later. Can you describe what can I do with Zorbage?

  Define multidimensional data sets with flexible out of bounds data
    handling procedures. Zorbage has an excellent abstraction for multi-
    dimensional data that is easy to understand and use and is at the same
    time quite powerful.

  Use data views of your multidimensional data source to very rapidly set
    and get values. Data views allow one to write any sample visiting
    algorithm desired. The views are especially good with nested for loop
    approaches to iterating through data. Their speed is comparable to
    direct 1-d array access. Pull out arbitrary planes of your multi-
    dimensional data set with ease.
        
  Use many different data types (133 and counting)

    - signed integers (bits: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
                           14, 15, 16, 32, 64, 128, unbounded)

    - unsigned integers (bits: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
                           14, 15, 16, 32, 64, 128)

    - rational numbers (unbounded)
    
    - floats (bits: 16, 32, 64, 128, unbounded)

    - gaussian integers (bits: 8, 16, 32, 64, unbounded)
    
    - complex numbers (bits: 16, 32, 64, 128)
    
    - quaternions (bits: 16, 32, 64, 128)
    
    - octonions (bits: 16, 32, 64, 128)
    
    - Unicode16 characters and variable length strings and fixed length strings
  
    - booleans
    
    - n-dimensional real Points
    
    - ARGB, RGB, and CIE LAB tuples
    
    - vectors and matrices and tensors made of the various types
  
  Define your own data types
  
    Do you have a custom data type? (Like RNA base pairs?). Can you define operators
      between elements? (Like when two base pairs are equal?). Then you can define
      their algebra and reuse any zorbage algorithms that accept similarly defined
      types.
  
  A conversion api exists for moving between types accurately and efficiently.
  
  There are types that are compound: complex numbers, quaternions, and
    octonions based on any of the floating types.
    
  Types can be stored in arrays, files, sparse structures, and JDBC
    database tables.
    
  You can allocate and use huge (length up to 2^63 elements) data structures.
  
  Data access revolves around 1-d lists that act like arrays. Arrays can be
    concatenated, trimmed, subsampled, masked, padded, readonly, as well as
    other abstractions. At the heart, each multidim data source has one of
    these arrays behind it. These arrays are not indexed by integers but
    instead by longs and break many of the limitations of Java arrays.
      
  Array storage can be in native types (for speed of access) or for many
    integer types they can be bit encoded (to save space).
  
  You can use existing or write your own generic algorithms that work with
    all the types transparently. For instance Zorbage has one Sort algorithm
    that can sort a list made of any of the above defined types while doing
    no data conversions.
  
  You can use existing or write your own algorithms that work with numbers,
    vectors, matrices, and tensors. Zorbage includes 100's of predefined
    algorithms too. It can find roots, find derivatives, and solve differential
    equations numerically involving arbitrary scalar and vector functions and
    procedures. It also includes algorithms from linear algebra, signal
    processing, statistics, set theory, and analysis.
    Algorithms include:
    - Basic vector and matrix and tensor operations
    - Transcendental functions of various precision floats and matrices
    - Basic runge kutta ode solver
    - FFT and Inverse FFT
    - Convolutions and Correlations
    - Resampling algorithms
    - LU Decomposition and LU Solving (simultaneous equation solving)
    - Basic statistical functions
    - Most C++ STL algorithms replicated
    - Numerous parallel algorithms are provided for quickly processing data.
  
  Define complex data sampling algorithms from prebuilt sampling components.
  
  Use type safe first class Function and Procedure objects. Pass them as
    arguments to code that can transform your data quickly and generically.
    Write methods that return Functions and Procedures if needed. Compute
    values from user defined Functions and Procedures.
 
  Use parsers to create Procedures from strings. These Procedures represent
    equations that when fed values will compute a return value (the result of
    applying the inputs to the equation). Equations can return numbers, vectors,
    matrices, or tensors. The subcomponents of these can be reals, complexes,
    quaternions, or octonions. Equations can be built out of numbers, input
    variable references, constants (like E, PI, etc.) and typical functions
    like sin(), atan(), exp(), log(), etc. For more info see:
    https://github.com/bdezonia/zorbage/blob/master/EQUATION_LANGUAGE

Why Java?

  - numerical computing in Java is under represented
  - with Java there is no need to worry about memory or pointer bugs
  - the JRE optimizes many object allocations into stack allocations
  - Java is portable: Zorbage runs everywhere the JVM does
  - Java is safer at runtime than C++ while still supporting good performance
  - Java is faster at runtime than Python/R/Matlab while also avoiding simple
      "oops, that typo in my code just killed my long run" situations

Why is Zorbage the way it is?

  - procedural programming is efficient
  - object oriented comforts can be used where needed
  - generics allow for reusable algorithms. often one can write one
      algorithm and use it for reals, complex numbers, quaternions,
      matrices, etc.
  - there is no telling what one bored developer can get up to when
      looking for fun
  
Zorbage breaks Java barriers
  
  - in Zorbage "arrays" are indexed by 64-bit long integers rather
      than 32-bit integers. Relatedly data lists sizes can reach
      2^63.
      
  - in Zorbage, due to JVM optimizations, you can pass primitives by reference

  - supports multidimensional arrays
      
  - breaks the limitation to integer types being in byte aligned sizes
    
  - supports unsigned numbers
    
  - supports 128-bit signed and unsigned integers
    
  - supports 128-bit floating point numbers
    
  - supports 16-bit floating point numbers
    
  - supports high precision floating point numbers
    
  - supports unbounded ints
    
  - supports (unbounded) rational numbers
    
  - builds reals, complexes, quaternions, and octonions from any
      of the floating types. You can write one algorithm and substitute
      one runtime parameter to calculate in 16 bit, 32 bit, 64 bit, 128
      bit or seemingly unbounded accuracy.
    
Zorbage is fast
  
  - Zorbage can run numeric algorithms in speeds comparable to C++
    
  - Some multithreaded algorithms are provided
      - matrix multiplies
      - resampling
      - convolutions/correlations
      - fills
      - transforms
      - data conversions
      - apodizations

Zorbage is flexible
  
  - supports n-dimensional data sets with flexible out of bounds data
      handling procedures
      
  - the n-dimensional data sets can use the equational language to
      carefully and powerfully calibrate their axes.

Is there a way to get data into a Zorbage backed application?

  Code has been written to load ECAT scan data using the zorbage
    ecat library into Zorbage structures. You can find it here:
    
    https://github.com/bdezonia/zorbage-ecat

  Code has been written to load Java audio data using the zorbage
    jaudio library into Zorbage structures. You can find it here:

    https://github.com/bdezonia/zorbage-jaudio

  Code has been written to load raster GIS data using the GDAL library
    into Zorbage structures. You can find it here:
  
    https://github.com/bdezonia/zorbage-gdal

  Code has been written to load raster GIS data using the NetCDF library
    into Zorbage structures. You can find it here:
    
    https://github.com/bdezonia/zorbage-netcdf

  Code has been written to load Nifti scan data using the zorbage
    nifti library into Zorbage structures. You can find it here:
    
    https://github.com/bdezonia/zorbage-nifti

  Code has been written to load NMR data using the zorbage NMR
    library into Zorbage structures. You can find it here:

    https://github.com/bdezonia/zorbage-nmr

  Code has been written to load common raster data formats using
    the SCIFIO library into Zorbage structures. You can find it here:
  
    https://github.com/bdezonia/zorbage-scifio
  
Is there a way to view Zorbage data?

  Coding is underway on a data viewer. It is available now (alpha ready).
  You can find it here: https://github.com/bdezonia/zorbage-viewer

Programming notes

  Java 11 info
    
    Zorbage has been compiled and tested using Maven with OpenJDK Java 11 on Linux

  Java 8 info

    Zorbage was previously compiled and tested using Maven with OpenJDK Java 8 on Linux

      The Zorbage code has been scrubbed of API calls later than Java 8 and the
      Zorbage library API should be compilable with Java 8.

  Java 7 info

    At one time Zorbage was compiled and tested using Maven with Oracle Java 7 on the
      Macintosh. Recently the Zorbage code was updated to take advantage of some Java 8
      features so it is possible it will not be callable from Java 7. YMMV

  Other platforms
    
    Zorbage has been compiled and tested with many versions of Eclipse and some versions
    of IntelliJ.

Acknowledgements

  Thank you Dr. Pepper for your timely and generous contributions to the
  project. Zorbage would not be the same without you.

Partial bibliography

  Books:
  
  A Book of Abstract Algebra; Pinter
  A Student's Guide to Vectors and Tensors; Fleisch
  Advanced Engineering Mathematics; Wylie, Barrett
  Algorithms for Computer Algebra; Geddes
  Basic Partial Differential Equations; Bleecker
  Calculus and Analytic Geometry; Thomas, Finney
  Compilers; Aho, Sethi, and Ullman
  Complex Variables and Applications; Brown, Churchill
  Complex Variables with Applications; Wunsch
  Computer Algebra and Symbolic Computation (2 vols); Cohen
  Design Patterns, gang of four
  Differential Equations; Ross
  Differential Equations and Linear Algebra; Edwards and Penny
  Digital Image Proccessing; Burger
  Digital Image Processing; Gonzalez, Woods
  Digital Signal Processing; Proakis
  Discrete Time Signal Processing; Oppenheim and Schaffer
  Div, Grad, Curl, and All That
  Elements of Modern Algebra; Gilbert
  Elements of Programming; Stepanov
  From Mathematics to Generic Programming; Stepanov
  Functions of Matrices; Higham
  Handbook of Math and Computational Science; Harris, Stocker
  Haskell School of Expression
  Haskell: the Craft of Functional Programming; Thompson
  Introduction to Abstract Algebra; Dubisch
  Introduction to Algorithms; Cormen, Lesserson, Rivest, Stein
  Introduction to Computer Simulation Methods; Gould
  Introduction to Octonion and Other Non-associative Algebras in Physics; Okubo 
  Linear Algebra and Analytic Geometry for Physical Sciences; Landi, Zampini
  Matrix Algebra; Robbin
  Modern Mathematical Methods for Physicists and Engineers; Cantrell
  NMR Data Processing; Hoch and Stern
  Numerical Methods for Engineers; Chapra
  Numerical Methods for Engineers and Scientists; Gilat
  Numerical Methods for Scientists and Engineers; Hamming
  Numerical Recipes; Press etal
  Partial Differential Equations for Scientists and Engineers; Farlow
  Probability and Statistics for Engineering and the Sciences; Devore
  Programming Clojure; Miller, Halloway, Bedra
  Real World Haskell
  Rings, Fields, and Groups; Arnold
  Schaum's Outline of Tensor Calculus; Kay
  Scientific Computing With Python; Fuhrer, Solem, Verdier
  Structure and Interpretation of Computer Programs; Abelson
  Tensors Made Easy; Bernacchi
  Tensor Calculus Made Simple; Sochi
  The Art of Computer Programming; Knuth
 
  Online manuals:
  
  GNU Scientific Library documentation
  Boost Library documentation
  C++ STL documentation
  Mathematica documentation
  Matlab documentation
  R documentation
  Julia documentation
  Haskell documentation
  Ruby documentation
  Pascal language definitions

  Online articles:
  
  Many articles on Mathworld and Wikipedia and StackExchange
  Many other web pages and pdfs and slide presentations

Paying it forward
  
  If you like Zorbage as a library or as a source of ideas then please
  visit my daughter's band's BandCamp page, listen to their music, and
  buy something if you like what you hear. Thanks.
  
    https://dearmrwatterson.bandcamp.com/