(Replicated below on 5/25/2021 for my comments and thoughts; no attempt at asserting ownership intended.)

The Grand Unified Programming Theory: The Pure Function Pipeline Data Flow with Principle-based Warehouse/Workshop Model

Copyright © 2018 Lin Pengcheng. All rights reserved.

Table of Contents

Key innovative ideas

My and Other People's Related Views

Keep it Simple and Unified.
        ---- Lin Pengcheng

NASA’s 10 rules for writing mission-critical code:
1. Restrict all code to very simple control flow constructs.
        ---- Gerard J. Holzmann, NASA JPL lead scientist.
Minimize control flow complexity and "area under ifs", 
favoring consistent execution paths and times over "optimally" avoiding unnecessary work.
        ---- John Carmack

Clojure Aphorism: A tangled web of mutation means any change to 
your code potentially occurs in the large. 
        ---- The Joy of Clojure (2nd Edition, Chapter 10)
Bad programmers worry about the code. 
Good programmers worry about data structures and their relationships.
        ---- Linus Torvalds
Data dominates. If you’ve chosen the right data structures and organized things well, 
the algorithms will almost always be self-evident. 
Data structures, not algorithms, are central to programming. 
        ---- Rob Pike
It’s better to have 100 functions operate on one data structure 
than 10 functions on 10 data structures.        
        ---- Alan Perlis
             the first recipient of the Turing Award (1966)
             A founding father of Computer Science as a separate discipline
Show me your flowcharts and conceal your tables, 
and I shall continue to be mystified. Show me your tables, 
and I won’t usually need your flowcharts; they’ll be obvious.
        ---- Fred Brooks, Turing Award (1999), The Mythical Man-Month
Even the simplest procedural logic is hard for humans to verify, 
but quite complex data structures are fairly easy to model and reason about. 
Data is more tractable than program logic. It follows that where you see a choice 
between complexity in data structures and complexity in code, choose the former. 
More: in evolving a design, you should actively seek ways to shift complexity from code to data.
        ---- Eric Steven Raymond, The Art of Unix Programming, Basics of the Unix Philosophy
Metaphors for a Richer Understanding of Software Development.
        ---- The most valuable chapter of "Code Complete": Chapter 2
Principles-based are better than rules-based.
        ---- International Accounting Standards


Because a pure function depends only on its input and produces only its output, pure functions can be used as pipes.
A dataflow is formed by connecting a series of pure functions in series.
A dataflow code block, used as a function, is equivalent to an integrated circuit element (or board).
A complete integrated system is formed by serial or parallel dataflows.

Put another way, data and logic are strictly separated:
data and logic are separated at the element level, and data is processed as a stream.

(defn f [[evens odds total amax amin] x]
  (let [[evens odds] (cond 
                       (even? x) [(inc evens ) odds]
                       (odd? x)  [evens (inc odds)]
                       :else     [evens odds])
        total (+ total x)
        amax  (max amax x)
        amin  (min amin x)]   
     [evens odds total amax amin]))

(reduce f [0 0 0 ##-Inf ##Inf] [5 6 8 -3 -9 11 156 6 7])

;;[4 5 187 156 -9]

For me, programming is the process of designing a data model that is simple and fluent to manipulate.
More than 80% of the functions in my projects are ->> threading macro code blocks;
each step is simple, verifiable, replaceable, testable, pluggable, extensible,
and easy to run on multiple threads.
The Clojure threading macro provides language-level support for the Pure Function Pipeline Data Flow.
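
As a minimal sketch of the style described above, each step of a ->> block can be a small pure function that is tested or swapped out on its own. The function names (`parse-line`, `valid?`, `line-total`, `order-total`) and the sample data are hypothetical and not taken from the original project:

```clojure
(require '[clojure.string :as string])

;; Hypothetical pipe functions: each one is pure and independently testable.
(defn parse-line
  "Turns \"item,qty,price\" into a map."
  [line]
  (let [[item qty price] (string/split line #",")]
    {:item  item
     :qty   (Long/parseLong qty)
     :price (Double/parseDouble price)}))

(defn valid? [{:keys [qty]}]
  (pos? qty))

(defn line-total [{:keys [qty price]}]
  (* qty price))

(defn order-total
  "A ->> block used as a function: raw lines in, one number out."
  [lines]
  (->> lines
       (map parse-line)
       (filter valid?)
       (map line-total)
       (reduce + 0.0)))

(order-total ["apple,2,1.5" "pear,0,9.0" "plum,4,0.5"])
;=> 5.0
```

Replacing one step (for example a stricter `valid?`) does not touch the other steps, which is the "replaceable, pluggable" property claimed above.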

A ship at sea sails by its helmsman, and programming moves toward the data. Between the initial state and the final state,
the shortest distance is a straight line. Simplicity is the root of speed, stability and reliability.

Those who are good at war have no surprising victory, no reputation for wisdom, no honor for courage.
        ---- Sun Wu, The Art of War
             Famous Chinese military strategist and politician,
             sage of military science, 
             ancestor of Eastern military science

The design philosophy of the industrial pipeline and of the pure function pipeline data flow is the same.
In essence, both say: "Head toward the goal step by step; every step moves forward toward the final goal
until it is reached." Success is therefore inevitable rather than surprising, and the process is simple repetition.
Once the method is mastered, it becomes a simple, repetitive, even boring technique.
This is the simplicity and repetition that large industrial production lines pursue.

Five basic pure function pipeline data flow components

1 Pipeline Component

Pipeline Component (forward flow)

Pipe functions are pure functions.
A ->> block function is equivalent to
an integrated circuit component (or board).

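(defn f [x]
  (->> x
       ;; ... pipe functions (elided in the source) ...
       ))
(defn f [{:keys [x y] :as m}]
  (->> x
       (f1 y ,)
       ;; ... further pipe functions (elided in the source) ...
       ))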

; R language style functions: 
; - multiple named parameters
; - the parameters can be out of order when calling the function
; - most of the parameters have default values

(def ^:private defa-opt-map {:a 0 :b 9})

(defn f [opt-map]
  (let [opt-map (merge defa-opt-map opt-map)
        {:keys [a b c]} opt-map]
    ; doing something
    [a b c opt-map]))

(f {:a 3 :c 15})
;=> [3 9 15 {:a 3, :b 9, :c 15}]  

;opt-map can provide both unix and windows 
;style parameters at the same time, and the 
;performance loss is negligible.
;dos style copy
(def ^:private defa-opt-map
  {:src  ""   :dest ""
   :A    nil  :B    nil
   :D    nil  :L    nil
   :V    nil  :N    nil  
   :Y    nil  :Z    nil
   :target nil})
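(defn copy
  ([opt-map]
    (->> (merge defa-opt-map opt-map)
         ;do sth. (remaining steps are elided in the source)
         ))
  ([src dest]
    (->> (assoc defa-opt-map 
                :src src 
                :dest dest)
         ;remaining steps are elided in the source
         )))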

2 Conditional Branch

A (cond) or (if) block as a function.

(defn f [x]
  (cond
    (= x 1) (f1)
    (= x 2) (f2)
    :else   (f3)))
(defn f2 [x y]
  (-> (> x 2)
      (and , (< y 6))
      (if , 25 30)))
(require '[clojure.string :as string])
(defn path-combine [s1 s2]
  (cond
    (string/starts-with? s2 "/") 
      s2   ; assumed: this result and the :else below are missing in the source
    (not (string/ends-with? s1 "/"))
      (-> (string/split s1 #"[\\/]")
          (#(string/join "/" %))
          (str , "/")
          (path-combine , s2)) 
    :else
      (-> (string/join "/" [s1 s2])
          (string/replace ,  #"[\\/]+" "/")))) 

3 Feedback Circuit

Feedback circuit (reflow, whirlpool, recursive):
A tail recursive function is equivalent to a feedback circuit.

Note: map is batch processing. It can be regarded as a queue of tourists:
repeating the ticket check at the entrance is a forward action,
not feedback or reflow.

(defn f [i]
  (if-not (zero? i)
    (-> i dec recur)))
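
A slightly larger feedback circuit, as a sketch: the output of each step is fed back as the input of the next through `recur`. The function `collatz-steps` is a hypothetical example chosen only because it needs both a value and an accumulator to flow back:

```clojure
(defn collatz-steps
  "Counts the steps needed for n to reach 1. Each recur feeds the
   new value and the step counter back into the same function."
  ([n] (collatz-steps n 0))
  ([n steps]
   (cond
     (= n 1)   steps
     (even? n) (recur (quot n 2) (inc steps))
     :else     (recur (inc (* 3 n)) (inc steps)))))

(collatz-steps 6)
;=> 8
```

Because `recur` is only allowed in tail position, the compiler itself checks that the function really is a loop and not a growing stack of calls.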

4 Shunt

For example: data partitioning, parallel processing

(->> data
     (partition n ,)
     (pmap f ,))
(->> [pipe-f1 pipe-f2 pipe-f3]
     (pmap #(% data) ,))

5 Confluence

Confluence(reduce): reduce the result of the shunt

(->> data
     (partition n ,)
     (pmap f1 ,)
     (reduce f2 ,))   
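
The shunt and confluence steps above can be sketched as one runnable pipeline. The names `chunk-sum` and `parallel-sum` are hypothetical. The sketch uses `partition-all` rather than `partition`, because `(partition n coll)` drops a trailing chunk shorter than n:

```clojure
;; Workshop function (pure): summarize one partition.
(defn chunk-sum [chunk]
  (reduce + 0 chunk))

(defn parallel-sum
  "Shunt: split data into chunks of n and process them with pmap.
   Confluence: reduce the partial results into one value."
  [n data]
  (->> data
       (partition-all n ,)   ; keeps the last, shorter chunk
       (pmap chunk-sum ,)
       (reduce + 0 ,)))

(parallel-sum 3 (range 1 11))
;=> 55
```

`pmap` runs on background threads, so a standalone script may want to call `(shutdown-agents)` before exiting.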

Code example

Code example 01

;Traditional expression, chaotic logic, unreadable.
(if (and (> x1 x2)
         (or (< x3 x4) 
             (and (or (> y1 y2) 
                      (< y3 y4))
                  (not= x5 x6)))
         (keyword? x7)) 
  :t
  :f)

;Pure Function Pipeline Dataflow
;Unrestricted expression, just read in order. 
;Closer to the order of execution of the machine.
(->  (> y1 y2)
     (or  , (< y3 y4))
     (and , (not= x5 x6))
     (or  , (< x3 x4))
     (and , (> x1 x2))
     (and , (keyword? x7))       
     (if  , :t :f))

Code example 02

(def data
  {:a [[:b :c :d]
       [:e :f :g]
       [:h :i :j]]
   :k [[:l :m :n]
       [:o :p :q]
       [:r :s :t]]})

(defn f1 [[k v]]
  (let [[h & t] v
        f   (fn [x] (mapv #(vector :td %) x))
        tds (map #(->> % f (into [:tr] ,)) t)]
     (->> (f h)
          (into [:tr [:td {:rowspan (count v)} k]] ,)
          (conj tds ,))))

(->> data
     (reduce #(->> %2 f1 (into %1 ,)) [:tbody] ,)
     (conj [:table] ,))

; result (hiccup DSL):
[:table
  [:tbody
    [:tr [:td {:rowspan 3} :a] 
         [:td :b] 
         [:td :c] 
         [:td :d]] 
    [:tr [:td :e] 
         [:td :f] 
         [:td :g]] 
    [:tr [:td :h] 
         [:td :i] 
         [:td :j]] 
    [:tr [:td {:rowspan 3} :k] 
         [:td :l] 
         [:td :m] 
         [:td :n]] 
    [:tr [:td :o] 
         [:td :p] 
         [:td :q]] 
    [:tr [:td :r] 
         [:td :s]
         [:td :t]]]]     

HTML Table (a and k each span three rows):

| a | b | c | d |
|   | e | f | g |
|   | h | i | j |
| k | l | m | n |
|   | o | p | q |
|   | r | s | t |

Code example 03

See also:
- Implement relational data model and programming based on hash-map (NoSQL)
- Babashka Script: Notepad++ Markdown Literary Programming with live preview for Clojure that don't break the syntax of any programming language
- Babashka Script: Notepad++ Edit Clojure hiccup (HTML DSL) with live preview

Classical Model

The true sign of intelligence is not knowledge but imagination (analogy).
      ---- Albert Einstein

Analogy is an application of algebraic thinking. An analogy may not hold in the real world,
but it does hold in the virtual world of software:
it is very easy to map an old model onto a new model and upgrade it according to the standard specifications of the new model.
In management, this is called "merger by absorption".

This is a typical application of the philosophy of the Tao and the Grand Unified Theory.

Gantt chart


Warehouse/Workshop Model

Warehouse Workshop Model

Microservice Architecture

Overview of the model

A workshop is an independent unit that performs tasks within the system.
Each workshop is an independent pipeline with a single function.
Workshops are small, simple and clear.

If you understand that software is a factory that produces data,
then you can understand the warehouse/workshop model:
production lines (pure functions, pipelines, workshops) are separated from production materials (data),
and continuous production on the production lines forms a data flow.

The basic principle of the system is similar to that of an in-memory database system.
As in an in-memory database, all tasks of the system are completed by scheduling stored procedures (workshops),
and all side effects (persistence, distribution, etc.) are handled by the in-memory database (warehouse).

Principles of the model

Division of tasks

Single Leader and Unified Scheduling

Empowerment and Management by Objective

Single form

Concentration and decentralization

Level chain





Exception handling

Framework code of the model

;A workshop is a pipeline (pure function).
;It is run after the scheduler allocates the initial data (parameter), 
;and its output data (return value) is also "received" and "processed" by the scheduler.
(defn workshop [init_data]
  (->> init_data
       ;; ... pipe functions (elided in the source) ...
       ))

(def warehouse (atom {}))

(defn scheduler [key reference old-state new-state]
  ;1. According to the new state (such as orders, etc.),
  ;   schedule workshops to complete tasks.
  ;2. Side effects: 
  ;   2.1. Exchange data with other warehouses as needed 
  ;        (distributed, other databases, disk, etc.)
  ;   2.2. Persist data, etc.
  )

(add-watch warehouse :scheduler scheduler)
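
To make the framework concrete, here is one possible runnable filling of it. This is a sketch under assumptions, not the author's implementation: the `:orders` and `:results` keys, the pricing task and the order shape are all hypothetical.

```clojure
;; Workshop: a pure pipeline. Hypothetical task: price one order.
(defn workshop [{:keys [qty unit-price]}]
  (->> (* qty unit-price)
       (hash-map :total ,)))

;; Warehouse: the only place where state lives.
(def warehouse (atom {:orders {} :results {}}))

(defn scheduler
  "Watch function: finds orders that have no result yet, runs the
   workshop on each and stores the output back into the warehouse."
  [_key reference _old-state new-state]
  (let [pending (remove #(contains? (:results new-state) (key %))
                        (:orders new-state))]
    (when (seq pending)
      (swap! reference update :results
             into (map (fn [[id order]] [id (workshop order)]) pending)))))

(add-watch warehouse :scheduler scheduler)

;; Placing an order is all the caller does.
(swap! warehouse assoc-in [:orders 1] {:qty 3 :unit-price 5})

(:results @warehouse)
;=> {1 {:total 15}}
```

Watches on an atom are called synchronously on the thread that changed it, so this sketch stays single-threaded; a fuller scheduler would hand the pending work to a thread pool. The `when` guard matters: the scheduler's own `swap!` triggers the watch again, and the second call finds nothing pending and stops.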

Everything is unified

The unification of programming technology and system architecture

Algorithms + Data Structures = Programs
      ---- Niklaus Wirth, Turing Award (1984), Father of Pascal

It’s better to have 100 functions operate on one data structure 
than 10 functions on 10 data structures.        
      ---- Alan Perlis
           the first recipient of the Turing Award (1966)
           A founding father of Computer Science as a separate discipline

The unification of single-threaded and multi-threaded and asynchronous and distributed

A "Fire-and-Forget" weapon is a guided missile with independent guidance capability. It needs no external support:
it automatically tracks and strikes the target, and needs no control after launch.
This improves the efficiency of both the missile and the launcher
and reduces the missile's dependence on other systems for updated information,
so the launcher can attack the largest number of targets in the shortest time and its survivability improves.
The future direction of guidance technology is precisely "Fire-and-Forget" precision guidance.

For the same reason, I think the future direction of concurrent and parallel programming is also "Fire-and-Forget":
a change of focus from "code and function development" to "data control, data flow management, data lifecycle management,
data standardization, process improvement (process reengineering), thread collaboration optimization, etc."

Existing pragmatic "Fire-and-Forget" concurrency and parallelism technologies include
software transactional memory (STM), multi-version concurrency control (MVCC) and git.
Compared with them, the advantage of the warehouse/workshop model is that
all tasks and resources are dynamically planned and globally deployed by the scheduling function,
so there is no resource competition and no transaction (version) conflict,
and optimal efficiency can be achieved.

async/await, Project Loom fiber, Gantt Chart, and Scientific Management

Asynchronous programming is unnecessary; async/await is a backward model
that will inevitably be eliminated.

From the perspective of Operations Research,
async/await should be abolished. When a thread has to wait,
that thread should end. In a factory, for example,
it never happens that one workshop stops in the middle
of its production process and waits at the door of another workshop
for that workshop's products before continuing.
Each workshop interacts only with the warehouse.
After the main thread (also a workshop) sends out the order data,
the warehouse generates the production plan
and sends data (messages) to the relevant workshops
for production until the task is completed.
There is no waiting anywhere in the process.
Instead, the production plan is generated from the order,
with the warehouse at the center,
and each workshop produces independently and in parallel.

From a management perspective, then, asynchronous technology looks like this:
people who understand the situation know that
the workers are waiting for raw materials;
people who do not understand the situation may mistakenly
think that they are on strike. Either way,
it wastes resources greatly.

If there is waiting inside an operation
(thread, asynchronous thread, fiber),
it must be because the operation design or
the proportion of system resource allocation is unscientific.
As long as there are tasks (data) that are not completed,
no worker (thread, asynchronous thread, fiber)
is allowed to wait. This is the most basic requirement
of scientific management.

Scientific Management
is committed to improving efficiency through Operations Research.
It divides work into indivisible, monotonous operations
(threads, asynchronous threads, fibers),
and then designs the optimal combination of operations
for the available resources to achieve maximum efficiency.
Not allowing any waiting inside an operation
(thread, asynchronous thread, fiber)
is the most basic requirement,
and this approach is also the most convenient
for overall coordination and optimization.
The most important design tool is the Gantt chart.
The best implementation method is the warehouse/workshop model as practiced in factories,
which is also the principle of ForkJoinPool
(the basic technology of fibers),
although the designer of ForkJoinPool did not realize this
and did not provide such guidance in the ForkJoinPool user guide.

In a Gantt chart there is no waiting inside a task (a bar in the chart; a thread or fiber),
and all waiting is global. When waiting begins, the task (bar) ends.
When the resource is obtained and work continues, that is already a new task (a new bar).
"async/await" puts a wait inside a task, which is completely wrong.
"async/await" does not conform to the most basic principles of Operations Research (ref01: wiki),
and I don't think an unscientific model can produce higher efficiency.

In the Gantt chart, the waiting point divides a large task into many smallest independent small tasks. A small task is a bar, a bar is a workshop, a workshop is a pipeline, and a pipeline is a pure function or equivalent pure function.

In addition to obtaining input parameters from the warehouse at the beginning and submitting output data to the warehouse at the end of the workshop, the workshops are independent of each other, and the workshop has nothing to do with the external environment. They do not need to know whether there is waiting or whether there is a previous step or a next step.

In this model, the system dispatch center (warehouse) can safely arrange the order of completion of tasks with the optimal algorithm.
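
The idea of cutting a task at its waiting point can be sketched with pure functions only. In this hypothetical example (the names `prepare`, `assemble`, `plan` and the order/material data are assumptions), an order that needs material is handled by two workshops, and a pure scheduling step advances an order only when its input is already in the warehouse:

```clojure
;; Two workshops (pure functions). Neither waits or knows about the other.
(defn prepare [order]              ; the work before the waiting point
  (assoc order :status :needs-material))

(defn assemble [order material]    ; the work after the waiting point
  (assoc order :status :done
               :product (str material "-" (:item order))))

(defn plan
  "Pure scheduling step: takes the warehouse state, returns the next state.
   An order moves on only if what it needs is already in the warehouse."
  [{:keys [materials] :as state}]
  (update state :orders
          (fn [orders]
            (into {}
                  (map (fn [[id {:keys [status item] :as order}]]
                         [id (cond
                               (= status :new)
                               (prepare order)

                               (and (= status :needs-material)
                                    (contains? materials item))
                               (assemble order (materials item))

                               :else order)]))
                  orders))))

(def state0 {:materials {} :orders {1 {:item "chair" :status :new}}})

(def state1 (plan state0))   ; first bar of the Gantt chart
(def state2 (plan state1))   ; no material yet: nothing runs, nothing blocks
(def state3 (plan (assoc-in state2 [:materials "chair"] "oak"))) ; second bar

(get-in state3 [:orders 1 :status])
;=> :done
```

Between `state1` and `state3` no function is suspended; the "wait" exists only as data (`:needs-material`) in the warehouse state.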

This is a Manufacturing Execution System (MES).

"async/await" is just an unorganized, undisciplined, imprecise, and unsafe practice.

The most typical case:
Amazon actually uses AI to monitor and dispatch employees;
those found to be inefficient are fired on the spot.

This method will not run out of resources (threads)
or eventually block, because the warehouse
(+ dispatch center = DBMS) will assign the maximum number of
workers (threads, asynchronous threads, fibers) in the system
to production according to the optimal production ratio
of the components (data).

They are independent and do not interfere with each other.
They are only responsible for doing their own work.
There is no need to bother to observe and wait for resources
during production. The warehouse (+ dispatch center = DBMS)
is responsible for it.

This is consistent with the basic principle of ForkJoinPool.

There is no direct relationship between fibers and asynchrony.

A fiber is made of two components — a continuation and a scheduler. As Java already has an excellent scheduler in the form of ForkJoinPool, fibers will be implemented by adding continuations to the JVM.

ForkJoinPool uses the warehouse/workshop model and the scientific management of operations research, which has been mentioned above.
My algorithm is similar to "Project Loom", which is equivalent to the data-driven version of "Project Loom".

A fiber is more like an Uber driver
than an Uber employee (system thread).
Uber does not need to pay a driver the minimum wage,
paid sick leave, unemployment benefits or other benefits,
so there is no fixed cost. Therefore,
Uber can treat Uber drivers (fibers)
as an almost unlimited resource
and dispatch them to complete tasks.


I only object to waiting inside a thread.
I think the waiting point is the natural boundary
of a thread, and the thread should be terminated
when there is a wait. The processing after the
waiting point is handled by the dispatch center,
which issues new threads as needed.

The unification of Microservice and Intelligent-thread

Reference: The unification of single-threaded, multi-threaded, asynchronous and distributed, Every Intelligent-thread is a microservice.

Microservice Architecture

The unification with Information System Integration Model

Many large enterprises have independent information systems produced
by different manufacturers that need to be integrated.

The unification with Microkernel Architecture

The unification with AOP

This is similar to an industrial zone that has one shared professional sewage treatment plant,
where the incoming sewage is treated separately.

The unification with Event-driven Architecture

The unification with Computer Hardware Architecture


Computer Hardware Architecture

Computer hardware is also a factory that produces data,
so the "warehouse/workshop model" can be applied to it as well.
The model uses memory as the core, not the CPU.
In the end, we can achieve the grand unification of all IT fields
such as hardware, software, the Internet and the Internet of Things.

The out-of-order execution technology of modern CPUs is a mistake (February 16, 2021)

Out-of-order execution is a product of
a wrong programming methodology,
a wrong computer architecture and
weak compilers.

In the "warehouse/workshop model",
each workshop is an orderly, high-speed straight line (a pipeline).
The warehouse scheduling function performs
dynamic planning and unified scheduling
for all workshops and resources,
without conflict or competition,
and runs them in the optimal order and with optimal efficiency.

Follower case: Apple M1 chip

My computer hardware architecture design was published on February 06, 2019.
One or two years later, the Apple M1 chip adopted the "warehouse/workshop model" design
and was released on November 11, 2020.

> there's also a new unified memory architecture
> that lets the CPU, GPU, and other cores exchange information
> between one another, and with unified memory,
> the CPU and GPU can access memory simultaneously
> rather than copying data between one area and another.
> Accessing the same pool of memory without the need
> for copying speeds up information exchange
> for faster overall performance.
> reference: Developer Delves Into Reasons Why Apple's M1 Chip is So Fast

Forecast on 2021-01-19

Forecast(2021-01-19): I think Intel, AMD, ARM, supercomputer, etc. will adopt the "warehouse/workshop model"

In the past, the performance of the CPU played a decisive
role in the performance of the computer. There were few
CPU cores, and peripherals were few in number and type.
Therefore, the CPU became the center of the computer
hardware architecture.

Now, with more and more CPU and GPU cores and a growing number
and variety of peripherals, the communication, coordination
and management of cores (or components, peripherals)
have become more and more important. They have become a key
factor in computer performance.

The core views of management science and computer science
are the same: use all available resources to complete the
goal with the highest efficiency. Accomplishing production goals
through communication, coordination and management of the
available resources is what management science does best.
The most effective, reliable and
absolutely mainstream way is the "warehouse/workshop model".

Changing only the architecture,
while leaving the CPU instruction set unchanged or only extending it,
will not affect CPU compatibility
and will open up huge room for optimization.

So I think Intel, AMD, ARM, supercomputing, etc. will adopt the "warehouse/workshop model",
which is an inevitable trend in the development of computer hardware.
My unified architecture and programming methodology will be vigorously promoted by these CPU companies,
sweeping the world from the bottom up.

Finally, the "warehouse/workshop model" will surely replace the "von Neumann architecture"
and become the first architecture in the computer field,
and it is the first architecture to achieve a unified software and hardware.

HPE Cray Supercomputer liked it on Twitter in 2021-04
"Anyone can build a fast CPU. The trick is to build a fast system."
       ---- Seymour Cray, the father of supercomputing

Note: on 2021-04-24 and 2021-04-27, HPE Cray Supercomputer liked it on Twitter. It means:
after its adoption by the Apple M1 chip, it will go on to enter the field of supercomputers.
My fast system is The Grand Unified Programming Theory: The Pure Function Pipeline Data Flow with Principle-based Warehouse/Workshop Model.

The unification with Integrated Circuit System

The unification with Programming Language Platform

Like Julia, a platform can build a Lisp into its internal core or internal representation
and expose a popular syntax externally. Then
syntax is not a problem at all: the compiler and the syntax
can evolve independently, side by side, freely, efficiently and flexibly.
Several languages can be supported at the same time,
for example Julia's native support for Julia and Lisp syntax
plus a third-party implementation of Clojure syntax,
with performance equivalent to the native syntax.
Converting Clojure syntax to Lisp syntax is simpler than
converting native Julia syntax. Implementing a language on a platform
whose internal representation is a Lisp is very simple; Racket is an example.
Whichever of Swift, Python, Ruby, Scala, F# and Java a developer likes,
each can be implemented separately,
and everyone is satisfied.

The unification with Clojure Web Application Model

Therefore, I recommend functions that take a single hash-map parameter.
This parameter can be mapped to standards, data tables and
databases (with constraints, stored procedures, schemas, etc.) as needed.
Clojure's immutable persistent data structures do not cause data cloning,
which makes them suitable for this scenario.
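
A short sketch of the "no data cloning" point. The `order` map is hypothetical sample data. A pure step that changes one key returns a new map, and the values it did not touch are shared with the original rather than copied:

```clojure
(def order {:id     1
            :items  [{:sku "a" :qty 2} {:sku "b" :qty 1}]
            :status :new})

;; A pure step "changes" one key by returning a new map...
(def shipped (assoc order :status :shipped))

;; ...the original is untouched,
(:status order)
;=> :new

;; and the untouched value is the very same object in both maps.
(identical? (:items order) (:items shipped))
;=> true
```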

The unification with Lifecycle Management

Algorithms derived from Chinese myths that have been circulating for thousands of years:
The book of life and death in hell.

The unification with classic AI and modern AI and explainable AI technology

Explainable AI System

2021-04-30: a succinct and complete description of the method, written many years ago, of applying the warehouse/workshop model to AI:

Robots and transformers

The unification with energy system

The unification with modern economic and social operating system

The unification with Other Models

Warehouse/Workshop Model Summary

In industry, the product standard is the interface;
the production method (code implementation) is not restricted.
Input raw materials (data) that conform to the standard,
output products (data) that conform to the standard,
and that is all.

Before entering the warehouse,
all data must be acceptance-checked first.

The input and output of the workshop
can only be standardized data,
the input-data comes from the warehouse,
the output-data is acceptance-checked
and sent to the warehouse.

Therefore, there is no
abnormal, erroneous or illegal data inside the workshop,
and no need to check data there.
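
One way to sketch the acceptance check at the warehouse door is an atom validator, which Clojure runs on every attempted state change. The standard used here (every order has a positive integer `:qty`) and the name `acceptable?` are hypothetical:

```clojure
(defn acceptable?
  "Hypothetical product standard: every order has a positive integer :qty."
  [state]
  (every? #(and (integer? (:qty %)) (pos? (:qty %)))
          (vals (:orders state))))

;; The warehouse refuses any state that fails the check.
(def warehouse (atom {:orders {}} :validator acceptable?))

(swap! warehouse assoc-in [:orders 1] {:qty 3})        ; accepted

(def rejected?
  (try
    (swap! warehouse assoc-in [:orders 2] {:qty -1})   ; fails the check
    false
    (catch IllegalStateException _ true)))

rejected?
;=> true

(:orders @warehouse)
;=> {1 {:qty 3}}
```

Since bad data never enters the warehouse, a workshop that reads its input from there can assume `:qty` is a positive integer without checking again.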

The code in the workshop is a pure function pipeline data flow.
It is simple, reliable and high-performance,
and easy to debug, observe, maintain and extend.

Workshops are independent of one another
and do not interact. Like Lego modules or
a ship's watertight compartments,
internal changes or abnormalities
in any one workshop do not affect the other workshops.

The difference between it and others

Disadvantages of FP and OO

The chief forms of beauty are order and symmetry and definiteness, 
which the mathematical sciences demonstrate in a special degree.
        ---- Aristotle, "Metaphysica"

Only The Pure Function Pipeline Data Flow with Warehouse/Workshop Model perfectly meets these requirements.
It is the best example of the beauty of programming.

Object-oriented and functional programming do not meet these three requirements of beauty at all.
The strange shapes and chaotic logic of OO and FP are not only unsightly and difficult to read and understand,
but also completely incompatible with the simplicity and repeatability that industrial production requires.

FP and OO are overly complicated and are not feasible in large-scale industry. They are a production method of the handicraft workshop, one that emphasizes personal skill. Personal skill then greatly affects product quality, which makes the production method extremely unreliable. FP and OO are actually taking a detour: highly embellished, ineffectual, and producing all kinds of failures.

Excessive application of OO and FP design patterns increases complexity
and the probability of errors and reduces performance, without any benefit.
The complex networks of relationships between objects in an OO system are also difficult to maintain.

The type system advocated by FP uses Hindley-Milner (HM) type inference to derive the types of parameters
and return values so that types need not be written by hand.
This approach violates "definiteness" in my programming aesthetics.
It is a big mistake to create a complex type system only to avoid doing one correct and simple thing.
I think even handwritten types are still too simple. My method is to follow manufacturing practice:
treat software as a factory that produces data, put data at the center,
precisely define standards for the data (products, parts, raw materials),
and produce according to those standards. Everything becomes very simple.

I tend to construct systems with the simplest concepts and the most basic techniques, syntax and functions.
As a way to implement my ideas, The Pure Function Pipeline Data Flow is the simplest, most stable, most reliable and most readable.
China had a great poet, Bai Juyi, whose poetry even illiterate people could understand and appreciate.
I hope that my code can be understood by a junior programmer even in the most complicated system.

The difference between it and Data-oriented, Data-driven

See also: The difference between Dataflow, Data-oriented, Data-driven

The difference between it and Microsoft Azure DataFactory-DataPipelines Architecture

See also: The difference between Warehouse/Workshop Model and Microsoft Azure DataFactory/DataPipelines Architecture

The difference between it and Flow-based programming

The difference between it and middleware

The code looks similar, but the idea is essentially different.

I can't agree with the idea of middleware.
It conflicts with the idea of integrated circuits.
In a circuit, the components (boards) do not move;
only the data (current) flows, which is the essential difference.

The difference between it and Rx

It is essentially different from Rx:

The difference between it and traditional unix-like pipe operator in FP language

Basic quality control

Basic quality control of pure function pipeline data flow. The code must meet the following three basic quality requirements before anything else is discussed. These simple and reliable evaluation criteria are enough to eliminate most unqualified code.
- Function evaluation: just look at the shape of the code (pipeline structure weight) and at whether the function is a pure function.
- Dataflow evaluation: a data flow has at most two functions with side effects, and only at the beginning and the end.
- System evaluation: just look at the circuit diagram; you can treat each function as a black box, like an electronic component.
- Code quality visualization:
  - For Lisp languages, an S-expression is a contour graph;
    it can very simply be transformed into a contour map or a 3D mountain map.
  - If the mountains are not high and the altitude values are similar,
    the quality of the code is good.
  - For non-Lisp languages, you can convert the source code into an abstract syntax tree (AST)
    and then into a contour map or a 3D mountain map.
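
As a rough sketch of the "altitude" idea, the nesting depth of a quoted S-expression can be computed directly. The function `depth` and its definition of altitude are assumptions made for illustration. The two forms are the expressions from Code example 01, written without the optional commas:

```clojure
(defn depth
  "Nesting depth (altitude) of a form: 0 for an atom,
   1 + the deepest child for a collection."
  [form]
  (if (coll? form)
    (inc (reduce max 0 (map depth form)))
    0))

(def nested
  '(if (and (> x1 x2)
            (or (< x3 x4)
                (and (or (> y1 y2) (< y3 y4))
                     (not= x5 x6)))
            (keyword? x7))
     :t :f))

(def piped
  '(-> (> y1 y2)
       (or (< y3 y4))
       (and (not= x5 x6))
       (or (< x3 x4))
       (and (> x1 x2))
       (and (keyword? x7))
       (if :t :f)))

[(depth nested) (depth piped)]
;=> [6 3]
```

The pipeline version is a low plateau: every step sits at the same altitude. The nested version peaks at twice the height. This measures the source as written, before macro expansion.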

Programming Aesthetics

Simplicity, Unity, order, symmetry and definiteness.
      ---- Lin Pengcheng, Programming aesthetics
The chief forms of beauty are order and symmetry and definiteness, 
which the mathematical sciences demonstrate in a special degree.
      ---- Aristotle, "Metaphysica"

My programming aesthetic standards are derived from the basic principles of science.
Newton, Einstein, Heisenberg, Aristotle and other major scientists hold this view.

The aesthetics of non-art subjects are often complicated and mysterious,
making it difficult to understand and learn.

The pure function pipeline data flow provides a simple, clear, scientific and operable demonstration.

Simplicity and Unity are the two guiding principles of scientific research and industrial production.

In the IT field, only two systems fully comply with these 5 programming aesthetics:

The biggest advantage of binary is that it makes calculation reach the ultimate simplicity and unity,
which produced digital logic circuits
and, in turn, the large-scale industrial production methods of computer hardware.

The out-of-order execution technology of modern CPUs is a mistake:


> if you can generate a certain design language,
> a certain symmetry among components,
> and if the different levels of abstraction reflect each other
> to an extent it's much easier to communicate intent
> to the implementers, and for the implementing teams
> to communicate with each other. If you have a fluid,
> coherent design strategy which you can boil down
> into reusable tactics you can build out any system
> of any size and complexity because it will take
> on a certain fractal shape that anyone within
> (even if they are only concerned with a tiny part)
> can easily and fully navigate and understand,
> and one glance is all it takes to know whether you're clean,
> or approaching a death star.

> of course, form should serve function, but I think that
> as humans we have evolved the concept of aesthetics
> as a subconscious, intuitive understanding of order.
> if we satisfy this intuition, we may be more certain that
> what we produced isn't a heap of chaos.

> Link


According to Taoism, flowing water is the perfect substance. It can always assume whatever shape is needed and proceeds step by step until its mission is completed and it reaches the end. The pure function pipeline data flow is like flowing water, close to the Tao.

Clojure just adds four persistent collections and some core functions to the JVM, and expresses code with those four persistent collections. It has no syntax; it can change as needed, like flowing water, close to the Tao.

Tao is simplicity, Tao is the law of nature, Tao is algorithm, Tao is everything, everywhere, at all times.
Tao is the great unification of everything.
Therefore Integrated Circuit Technology, Industrial Assembly Line Production Technology,
Accounting, Management, Architecture and so on
can all be used as Algorithms and Software Engineering Methods.
They can be transformed into one another.

In traditional Chinese culture, there is an unremitting pursuit of the ultimate grand unification (Tao).
Countless people know this concept, but in history only a few have achieved creative results.
They all had strong imagination, creativity and understanding.
Everyone’s knowledge, experience and interests are different,
so everyone’s Tao also has a distinctive personal character; that is,
only what is personal is the Tao, just as only what is national belongs to the world.

This level of achievement is traditionally known as comprehending the true meaning of "Tao",
and Einstein called it "true wisdom".


Killer Application

Software Design and Develop Automation

Software Design and Develop Automation (SDDA)

Software and hardware design is less different than software designers think, 
but more different than hardware designers think.
        ---- Fred Brooks, Turing Award (1999), The Mythical Man-Month

The simpler and more unified things are,
the better suited they are to large-scale industrial production.
Because binary makes calculation reach the ultimate simplicity and uniformity,
it produced digital logic circuits, and then the large-scale industrial production of computer hardware.

Therefore, if software is to reach the same large-scale industrial production as computer hardware,
software design and development must also achieve the ultimate simplicity and unity.

The pure function pipeline data flow and the principle-based warehouse/workshop model
not only realize the ultimate simplicity and unification of software development,
but also make software a simple and unified fractal system,
and realize the unification of software and hardware in the logical model.
Software can therefore use the design and development methods of computer hardware for large-scale industrial production,
which solves the problem Fred Brooks raised.

Computer Hardware Architecture, Follower: Apple M1 chip

Explainable AI systems using the law model and the Warehouse/Workshop Model

Great Historical Significance

Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it.
      ---- Alan Perlis, Epigrams in Programming.
           the first recipient of the Turing Award (1966)
           A founding father of Computer Science as a separate discipline           

When the solution is simple, God is answering.
Everything should be as simple as possible, but not simpler.
Most of the fundamental ideas of science are essentially simple, and may, as a rule, 
be expressed in a language comprehensible to everyone.
If you can't explain it simply, you don't understand it well enough.
Any intelligent fool can make things bigger, more complex, and more violent. 
It takes a touch of genius -- and a lot of courage -- to move in the opposite direction. 
      ---- Albert Einstein
           The greatest folk scientist in history :-)
           A professional clerk in the patent office 
           An amateur physicist
           Nobel prize in Physics (1921)
Make folk sciences great again :-)
0. It is the first grand unified theory in the field of natural sciences.
1. It perfectly defeats other messy and complex software engineering methodologies 
   in a simple and unified way.
2. It realizes the unification of software and hardware on the logical model,
   and the unification of programming technology and system architecture,
   through the innovative "Warehouse/Workshop Model".
   The "Warehouse/Workshop Model" will surely replace the "von Neumann architecture" 
   and become the number one architecture in the computer field;
   it is the first architecture to achieve unified software and hardware.
3. It achieves a leap in software production theory 
   from the era of manual workshops 
   to the era of standardized production in large industries.
4. It is the basis of, and the only way to, `Software Design and Develop Automation (SDDA)`. 
   SDDA is an innovative and revolutionary approach to developing large-scale software,
   just like `Electronic Design Automation (EDA)`.
5. It defines the programming aesthetic standards as simplicity, unity, order, symmetry and definiteness.
6. It is a particularly outstanding and trend-setting technical achievement. 
   It fits the principal criteria of the "Turing Award" perfectly.
   I think it should win the "Turing Award", the highest award in the computer field.
   If I cannot win the Turing Award, it must be that ACM lacks the ability to appreciate technology.
   History will prove what I said. 2020-03-07
      ---- Lin Pengcheng, Self-taught folk scientist

The idea of simplicity and unity is an important guiding principle of scientific research.
Unification of theories is the long-standing goal of the natural sciences,
and modern physics offers a spectacular paradigm of its achievement.
The knowledge of various disciplines shows that
the more universally applicable a unified theory is, the simpler it is,
and the more basic it is, the greater it is.
In addition, the simpler and more unified things are,
the better suited they are to large-scale industrial production.

The Pure Function Pipeline Data Flow,
based on the philosophy of Taoism and the Great Unification Theory,
has realized, for the first time in the computer field,
the unification of hardware engineering and software engineering on the logical model.
It extends the unification of code and data at the Lisp language level
to the unification of software and hardware at the system engineering level.
Both in the appearance of the code and in its runtime mechanism,
it is highly consistent with integrated circuit systems.
It is also broadly unified with other disciplines
(such as management, large industrial assembly lines, water conservancy projects, power engineering, etc.).
It is also very simple and clear, and its support for concurrency, parallelism,
and distribution is simple and natural.

There are only five basic components:

  1. Pipeline (pure function)

  2. Branch

  3. Reflow (feedback, whirlpool, recursion)

  4. Shunt (concurrent, parallel)

  5. Confluence.

The whole system consists of five basic components.
It perfectly achieves unity and simplicity.
It must be the ultimate programming methodology.
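The five components can be sketched as small combinators. This section does not define them formally, so the following is a minimal sketch under stated assumptions: the names `pipeline`, `branch`, `reflow`, `shunt` and `confluence` are hypothetical helpers chosen for illustration, written in Python rather than Clojure, and the thread pool is only one possible way to run a shunt.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def pipeline(*steps):
    """Pipeline: pass a value through pure functions from left to right."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def branch(predicate, if_true, if_false):
    """Branch: route the value into one of two pipes."""
    return lambda value: if_true(value) if predicate(value) else if_false(value)


def reflow(step, done):
    """Reflow: feed the output back into the same pipe until `done` holds."""
    def run(value):
        while not done(value):
            value = step(value)
        return value
    return run


def shunt(*pipes):
    """Shunt: send the same value through several pipes concurrently."""
    def run(value):
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(pipe, value) for pipe in pipes]
            return [future.result() for future in futures]
    return run


def confluence(merge):
    """Confluence: merge the shunted results back into a single value."""
    return lambda values: merge(values)


# Hypothetical flow that uses each component once.
flow = pipeline(
    lambda n: n + 1,                                                  # pipeline step
    branch(lambda n: n % 2 == 0, lambda n: n // 2, lambda n: n * 3),  # branch
    reflow(lambda n: n * 2, lambda n: n >= 100),                      # reflow
    shunt(lambda n: n + 1, lambda n: n - 1),                          # shunt
    confluence(sum),                                                  # confluence
)

result = flow(7)  # 7 -> 8 -> 4 -> 128 -> [129, 127] -> 256
```

Because every stage is a pure function, the whole `flow` is itself a pure function and can be used as a single pipe inside a larger flow.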

In addition, the IT industry is still a very young and immature discipline,
and current software engineering is still at the level of manual workshops.
Pure function pipeline data flow regards computer hardware and software
as factories that produce data. It unifies the architecture of computer hardware
and software into the architecture of the manufacturing industry (the "warehouse/workshop model").
It brings a next-generation architecture to computer hardware,
whose excellent performance has been verified by the Apple M1 chip.
It brings large-scale industrial production theories and methods to software engineering,
and it incorporates the IT industry into modern large industrial production systems.
This is an epoch-making innovative theory and method.

Modern society is an information society: IT controls everything and
penetrates everything. In my opinion, the development of IT is exactly
the same as the development of modern large-scale industrial production
systems. As the IT industry develops, data standard systems will be
widely established, improved and interconnected at the international,
national, industrial and enterprise levels, and standardization will reach
precisely down to every smallest part. As the basic theory and method,
the pure function pipeline data flow will become increasingly important
and will become the ultimate standard method in textbooks and industry.

The key to the industrialization of the IT industry is to
establish a complete standard system like that of traditional industry.
Software is a pipeline for producing products (data),
no different from a traditional industrial production line.
Therefore, software production will adopt enterprise management ideas
and develop software into something similar to a traditional industrial assembly line:
it takes in standardized raw materials (data), outputs standardized products (data),
and stores them in a warehouse (database).
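The assembly-line picture above can be sketched in a few lines. This is an illustration under stated assumptions, not the author's implementation: the record types `Order` and `Invoice`, the workshop `price_order`, and the helper `run_line` are hypothetical, and a plain dictionary stands in for the warehouse (database).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Standardized raw material (hypothetical schema)."""
    item: str
    quantity: int
    unit_price: float


@dataclass(frozen=True)
class Invoice:
    """Standardized product (hypothetical schema)."""
    item: str
    total: float


def price_order(order: Order) -> Invoice:
    """Workshop: a pure function from one standardized record to another."""
    return Invoice(item=order.item, total=order.quantity * order.unit_price)


def run_line(warehouse: dict, workshop, source: str, target: str) -> dict:
    """Take materials from one shelf, process them, and shelve the products.

    Returns a new warehouse; the one passed in is left unchanged.
    """
    products = [workshop(material) for material in warehouse[source]]
    return {**warehouse, target: products}


warehouse = {"orders": [Order("bolt", 10, 0.5), Order("nut", 4, 0.25)]}
updated = run_line(warehouse, price_order, source="orders", target="invoices")
```

Since both ends of the line are fixed record types, another workshop can consume `Invoice` records without knowing how they were produced.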

From the perspective of large industrial production theory,
standardizing the input raw materials (data) and output products (data)
has the following advantages:

The chief forms of beauty are order and symmetry and definiteness, 
which the mathematical sciences demonstrate in a special degree.
      ---- Aristotle, "Metaphysica"

Only the Pure Function Pipeline Data Flow with the Warehouse/Workshop Model perfectly meets these requirements.
It is the best example of the beauty of programming.

The role of a standard system can be seen in
the great progress in social productivity after
traditional industry moved from the era of manual workshops
into the era of large industrial production.

This method has been applied to a pure Clojure project on the scale of 100,000 lines of code,
which demonstrates its practicality.

Finally, if you agree with me, please help nominate me for the "Turing Award".




Imagination is more important than knowledge.
The true sign of intelligence is not knowledge but imagination.
Logic will get you from A to B, imagination will take you everywhere.
        ---- Albert Einstein

Like the most valuable chapter of “Code Complete” (Chapter 2, Metaphors for a Richer Understanding of Software Development), I try to inspire readers to discover useful patterns in life, work and personal interests and to use them as solutions in development, rather than mechanically applying other people’s cases.

In this way there will be endless cases, and there will always be endless ways to solve a problem. This is the "Tao".


I was asked why I didn't create a set of pipeline dataflow design patterns, as OO and FP have. It made me think of the classic Chinese mythological novel "Journey to the West": Bodhi asked the monkey whether he wanted to learn the 36 Tiangang transformations or the 72 Disha transformations. Once the monkey chose the 72 Disha transformations, his failure was a foregone conclusion. Bodhi, a great Taoist deity, must have understood the nature of the Tao, the laws of nature and the truth of constant change. Whichever set the monkey chose, his thinking was shackled from then on, as if he had put on an invisible tightening band, and he could no longer approach the "Tao". This was not teaching, but toying with the monkey. :-)


Simplicity does not mean easy.
The pure pipeline system is a simple system, but simplicity does not mean easy.
Implementing a pure pipeline system is a piece of systems engineering.
Hard work must be done to turn a complex system into a simple and smooth pure pipeline system.
This requires great wisdom and a great deal of difficult Business Process Design (or Reengineering).

Many people think that sages have secret tricks, despise simple technology, and pursue complex, 
difficult, and sophisticated technology, but this idea is completely contrary to the facts.
        ---- Wang Yangming, 
             the most famous and well-known thinker, philosopher, calligrapher, 
             strategist, and educator in China

In addition, as long as you read the "pure function pipeline data flow" carefully,
you will find that I use only the most basic common sense to solve problems,
and no overly complicated or delicate technology.
Common sense is humanity's best practice, or its most widely used and reliable theory.

The easiest way to learn

Write pure functions (pipes) as much as possible, and use only "pipe symbols" to link them together.
As long as you do this, your old programming thinking will naturally and gradually change
into a new programming thinking that focuses on changes in the data flow.
It will eventually evolve into this programming methodology:
the pure function pipeline data flow and principle-based warehouse/workshop model.

If the language has no "pipe symbol", you can use assignment statements instead,
which work in any programming language.
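A minimal sketch of the assignment-statement style, assuming Python as a language without a pipe symbol. All function names here (`parse`, `drop_negative`, `mean`, `render`, `run`) are hypothetical and chosen only for illustration. Each step is a pure function, each assignment names one stage of the data flow, and the only two side effects sit at the beginning and the end.

```python
def parse(line: str) -> list[float]:
    """Pure: turn a comma-separated line into numbers."""
    return [float(field) for field in line.split(",")]


def drop_negative(values: list[float]) -> list[float]:
    """Pure: keep only the non-negative values."""
    return [value for value in values if value >= 0]


def mean(values: list[float]) -> float:
    """Pure: arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def render(average: float) -> str:
    """Pure: format the result for output."""
    return f"mean={average:.2f}"


def run(read_line, write_line) -> None:
    """One data flow: input at the start, output at the end, pure pipes between."""
    raw = read_line()              # side effect 1: read the input
    numbers = parse(raw)
    kept = drop_negative(numbers)
    average = mean(kept)
    text = render(average)
    write_line(text)               # side effect 2: write the output
```

Calling `run(input, print)` wires the flow to the console. Passing a fake reader and writer instead makes the whole flow testable without any I/O.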

Principles-based is better than rules-based

Eating your own dog food

"Eating your own dog food" is a good way to demonstrate a theory.
If you can't use a simple logical model
(or example) to fully demonstrate a theory,
it means that the theory isn't a practical, complete and systematic
theoretical system, but can only be regarded as loose and
messy practical experience.

This theory consists of the following five parts:

Each component of this theory perfectly applies and
demonstrates the other components, which shows that this
theory is a practical, complete and systematic theoretical system.

Computer science is essentially a management science, and vice versa.

Computer science is essentially a management science, and vice versa.
    ---- Lin Pengcheng
         Creator of Computer Science Management School
         Creator of Management Science Computer School

In the field of computer science, I applied technologies such as management principles, warehouse/workshop model, and large-scale industrial standardization assembly lines to the fields of computer software, hardware, and AI, and realized the unification of computer theory.

In the field of management science, my discussion of management principles and warehouse/workshop model is the most scientific, systematic, simple, reliable, clear, and operable.

Whether an IT practitioner takes up a management position or a manager becomes an IT industry executive, either can use my theory as a bridge to the other science.

Clojure officially advocates pipeline programming in 2021

The State of Clojure 2021 Results report advocates pure functional pipeline programming:

First, Clojure programmers value a functional style of
programming facilitating a separation of data and process.
Coupled with its suite of immutable data structures,
Clojure applications are often built as pipelines of
data transformation functions that can be composed
to implement higher-level business concepts in software.
Link: State of Clojure 2021 Results at

On pure functional pipeline programming,
this article is the simplest, most comprehensive,
and most systematic treatment. It has industrial-grade
reliability, simplicity and unity.

End message

I spend my spare time developing my personal amateur project: Lin Pengcheng Financial Analyser.
Although my writing time is very limited, I will gradually improve it.
Compared with the content when I first set up the blog,
it is already much richer and better.




Last modified 01 July 2021