TINAG – This Is Not A Guide 2 Intalio

Just an attempt to Document & Decode the Intalio Platform for Non-specialists

Archive for December, 2008

BPMN to BPEL: be careful

Posted by José D. De la Cruz on December 9, 2008

I just read the article in InfoQ written by Pierre Vignéras about BPMN and BPEL.

I was finishing my latest blog entry, but I think that article is a bit misleading and I must react to it. Why? Because it builds its argument on a wrong interpretation of BPMN. In addition, this gives me a good excuse to explain some details about process algebras.

First, I must clarify that I like Bull and their suite, and that I had rich discussions with the Bull guys and a couple of OW2 Project Managers at LinuxDays in Geneva. Besides, I used eXo in a project recently, and I think that it is a nice portal and that their implementation of the JCR standard is just great. Bull’s philosophy is different from Intalio’s and very respectable, and Bonita is a nice-to-have BPM Suite. I like eXo and its family: Orchestra, JOnAS, the way they integrate…

Nevertheless, as a former researcher in this domain I cannot just let pass a misinterpretation as big as the one in his article, and especially the conclusion that he has drawn from that misunderstanding.
In very short form, the process they describe is: a new employee starts working, the company must carry out some checks, and some logistics take place (like giving her a computer). You can see the ~BPMN description here:
http://www.infoq.com/resource/articles/bpelbpm/en/resources/onboarding.png

Is this really a correct model?
Although Mr. Vignéras’ article starts with a discussion of structured and unstructured approaches to programming, his diagram is not structured. He simply did not respect the most basic good practices for modeling in BPMN.
As in any language, in BPMN there are different levels (as I described in my last entry): lexical, syntactic, semantic and pragmatic. The two latter levels mean that not everything that is grammatically correct is necessarily a valid construction in the “common sense” understanding (If you want more on this, you should read Chomsky).

I must suppose that the author did not know the subject he was dealing with in depth. At least, he did not analyze the BPMN semantics of the process in advance. As a matter of fact, the BPMN model and the BPEL one MATCH; it is the (textual) process description that is wrong :-O

In the description he says: “… When a new employee arrives at a company, a workflow is instantiated. First a record needs to be created in the Human Resources database. Simultaneously, an office has to be provided. As soon as the Human Resources activity has been completed, the employee can undergo a medical check. During that time, a computer is provided… “. If you face a description like this, you should ask yourself several things:

  • Which activities happen in parallel?
  • What does “During that time” mean?
  • How do you represent “can”? And what about the other modal words?

Let’s Put Some Music
Let us decompose the description into sentences:

  1. A = “First a record needs to be created in the Human Resources database”
  2. B = “Simultaneously, an office has to be provided”
  3. C = “As soon as the Human Resources activity has been completed”
  4. D = “the employee can undergo a medical check”
  5. E = “During that time, a computer is provided”

Going further, I will isolate the temporal and modal operators:

  1. TA = “First”
  2. TB = “Simultaneously”
  3. TC = “As soon as the … activity has been completed”
  4. TD = “can”
  5. TE = “During that time”

Then, back to the initial text we obtain:

  1. A = TA + “a record needs to be created in the Human Resources database”.
  2. B = TB + “an office has to be provided”.
  3. C  — let’s just forget it
  4. D = TD + “the employee undergo a medical check”
  5. E = TE + “a computer is provided”

Now we can write different descriptions of the BP. The basic thesis should be that “During that time” means “something that happens in parallel”.
Let us consider a quite basic process algebra (you can adopt a more precise CSP or CCS or Pi-calculus if you want):

  • sequence (.)
  • parallel construct (||)
  • exclusive OR (ø)
  • inclusive OR (+), and
  • the epsilon joker (ε).

Mr. Vignéras wrote a model that can be represented by this equation:

( ((A.C) + (ε ø D)) || (E) ) || (B.C.E)        :-O

No wonder the employee receives two computers: E appears on two parallel branches!!
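To make the double computer visible, the equation can be encoded as a small expression tree and the executions of E counted. This is only a sketch in Python, not something taken from the InfoQ article: the names `Act`, `Op`, `min_runs` and `max_runs` are hypothetical, the parenthesization is one reading of the equation above, and the inclusive OR is assumed to run at least one of its branches.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Act:
    """An atomic activity (A, B, C, D, E) or the epsilon joker."""
    name: str


@dataclass(frozen=True)
class Op:
    """A composite term; kind is 'seq' (.), 'par' (||), 'xor' (ø) or 'or' (+)."""
    kind: str
    args: Tuple["Term", ...]


Term = Union[Act, Op]

A, B, C, D, E, EPS = (Act(n) for n in ("A", "B", "C", "D", "E", "ε"))


def seq(*t): return Op("seq", t)
def par(*t): return Op("par", t)
def xor(*t): return Op("xor", t)
def ior(*t): return Op("or", t)


def max_runs(term, name):
    """Upper bound on how often activity `name` runs in one execution."""
    if isinstance(term, Act):
        return 1 if term.name == name else 0
    runs = [max_runs(arg, name) for arg in term.args]
    # Only the exclusive OR picks a single branch; the others may run them all.
    return max(runs) if term.kind == "xor" else sum(runs)


def min_runs(term, name):
    """Lower bound on how often activity `name` runs in one execution."""
    if isinstance(term, Act):
        return 1 if term.name == name else 0
    runs = [min_runs(arg, name) for arg in term.args]
    # Both ORs may pick the branch with the fewest runs (assumption for '+').
    return min(runs) if term.kind in ("xor", "or") else sum(runs)


# One reading of: ( ((A.C) + (ε ø D)) || E ) || (B.C.E)
model = par(par(ior(seq(A, C), xor(EPS, D)), E), seq(B, C, E))

print(min_runs(model, "E"), max_runs(model, "E"))
```

Under these assumptions both bounds come out as 2: neither occurrence of E sits behind a choice, so every execution hands out two computers.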

Now if we read the textual process description, and if we simplify the nondeterministic expression D from “can” to a deterministic “will certainly”, we may write the BP in 2 ways:

  1. f(A.B.C.D).E: this is not valid because E is not in parallel.
  2. f(A.B.C.D)||E: this is valid but ambiguous.

We must rewrite this last equation, but in a less ambiguous fashion. Because the sentence C establishes a causality between A and D, we can reduce the design space to those options that respect such causality. This does not, however, give us a unique option:

  1. (A.C.D) || B || E
  2. ((A.C.D) + B) || E
  3. ((A.C.D) || E) + B
  4. (A.C.D) || (B + E)
  5. (A.C.D) || (B.E)
  6. (A.C.D) || (E.B)

As you can see, at no point can we obtain the strange model of Mr. Vignéras. In each of these options there is a single path to E, i.e. at most one computer is given to the employee.
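The six options can be checked mechanically. The following Python sketch is only an illustration (the tuple encoding and the helpers `max_runs` and `keeps_causality` are hypothetical names, not part of any BPMN or BPEL tool): it verifies that E can run at most once in each option and that the causal sequence A.C.D is left intact.

```python
# Terms are nested tuples: (operator, operand, ...); activities are plain strings.
# Operators: "seq" (.), "par" (||), "or" (inclusive +), "xor" (exclusive ø).
ACD = ("seq", "A", "C", "D")

options = {
    1: ("par", ACD, "B", "E"),
    2: ("par", ("or", ACD, "B"), "E"),
    3: ("or", ("par", ACD, "E"), "B"),
    4: ("par", ACD, ("or", "B", "E")),
    5: ("par", ACD, ("seq", "B", "E")),
    6: ("par", ACD, ("seq", "E", "B")),
}


def max_runs(term, name):
    """Upper bound on how often activity `name` runs in one execution."""
    if isinstance(term, str):
        return 1 if term == name else 0
    op, *args = term
    runs = [max_runs(arg, name) for arg in args]
    # Only an exclusive OR is limited to one branch; the rest may run them all.
    return max(runs) if op == "xor" else sum(runs)


def keeps_causality(term):
    """True if the sequence A.C.D appears unchanged somewhere in the term."""
    if isinstance(term, str):
        return False
    return term == ACD or any(keeps_causality(arg) for arg in term[1:])


for number, term in options.items():
    print(number, max_runs(term, "E"), keeps_causality(term))
```

Every option reports one run of E at most and keeps A.C.D, which is the property the original model lacks.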

Are Exceptions Unstructured?
I was quite surprised by Mr. Vignéras’ view of exceptions as a non-structured approach. Actually, this was the real reason I felt uncomfortable when reading his article, before I even saw the BPMN modeling problem. He puts the blame on Business Analysts, on the real world, and on everything else but a good structured programming/modeling language (his title reads “Business Analysts Write Parallel and Unstructured Processes”)… These are my reasons for disagreeing:

  • First, Dijkstra’s 1968 article was really focused on the unstructured architecture of programs.
  • If one takes the time to study Dijkstra’s more complete “A Discipline of Programming” (from 1976), which is also more modern than the 1968 article, one can conclude that the essence of his message is that modules have contracts (I won’t go into the details of the guarded command language). Each contract is fulfilled if the precondition is respected. This does not exclude the introduction of exceptions, because they are basically the result of not respecting those preconditions. Dijkstra is very clear: “If the precondition is not respected, we can say nothing about the postcondition”… the rest does not come from him.
  • The work done by Meyer on the exception handling that was eventually built into the Eiffel language (and re-adopted by all major OO languages) is actually a nice proposal for a very structured way of dealing with erroneous conditions. It was a counter-proposal to defensive programming, and it guaranteed that someone had to deal with the error. It instantiates a very well-known and structured pattern: the chain-of-command, that’s it.
  • The fundamental and complementary works of Parnas (modularity), Liskov (the Substitution Principle), Rebecca Wirfs-Brock’s group and the Fusion method group (which dealt with roles and design principles) are perfectly compatible with the notion of exception.
  • All these proposals are built around the Hoare triple, and if you go really deep, the famous triple is another “not incompatible” specification tool.
  • Language theory, especially the automata that recognize grammars, says that an input is accepted if the path arrives at an end state. I’d like to see this community’s reaction to being considered non-structured.
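The contract argument can be sketched in a few lines. The following Python is only an illustration (it is not Eiffel, and it is not taken from Dijkstra or Meyer); the names `PreconditionError`, `provide_computer`, `logistics` and `onboard` are hypothetical. The callee states its precondition and promises nothing when it is violated; the exception then travels up the call chain until some caller deals with it.

```python
class PreconditionError(Exception):
    """Raised when a caller breaks the contract of a routine."""


def provide_computer(employee, hr_record_created):
    # Precondition {P}: the HR record exists.
    if not hr_record_created:
        raise PreconditionError(f"no HR record for {employee}")
    # Postcondition {Q}: guaranteed only when {P} held on entry.
    return f"computer assigned to {employee}"


def logistics(employee, hr_record_created):
    # This level cannot recover, so it lets the error propagate upwards.
    return provide_computer(employee, hr_record_created)


def onboard(employee, hr_record_created):
    # Someone up the chain has to deal with the broken contract.
    try:
        return logistics(employee, hr_record_created)
    except PreconditionError as err:
        return f"onboarding suspended: {err}"


print(onboard("Alice", True))   # computer assigned to Alice
print(onboard("Bob", False))    # onboarding suspended: no HR record for Bob
```

The control flow stays nested and predictable: each routine has one normal exit and one exceptional exit, and the handler sits at a well-defined level of the call chain.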

The unstructured nature of graphs (?)
Let us deal with the nature of graphs: I’m not fond of graphs, and I am almost allergic to them (more on that when I come to my Ph.D. dissertation some day). This does not mean that graphs are bad, because they are useful.

For example, I kind-of-hate them but I have to admit that when working on parallel computing and data-flow, they really helped us structure the solutions. However, you had to be aware of their power and their flexibility.

Graphs are useful not only for parallelizing frameworks and middleware. When I worked on real-time systems, I also enjoyed using Structured Analysis models (for example Hatley & Pirbhai, some 16 years ago), that were kind of data-flow + control flow (quite similar to BPMN when zoomed-in) and helped you understand the complex relations among architectural modules.

Yes, graphs can get really wild, and I agree that you can create monster models, but they are not necessarily unstructured. As a matter of fact, in the cases mentioned above a graph was a hyper-useful means to structure your solution. It all depends on your use of the modeling techniques.

CONCLUSION
The author of the article mentioned above wrote a wrong BPMN model, then he wrote a process description that does not correspond to the diagram, and finally he explained a second wrong model in BPEL. I know his goal was to prove something else, but I do not appreciate being misled.

I hope I have shown that many fundamental notions in Computer Science do not share the definition of “structured” that was presented by the author. I could also show that there is no single way to define “structured”, even for computer languages.

I think you can miss the point if you do not go far enough in your analysis. Brooks explains this very well when he talks about complexity and differentiates essential from accidental.

  • For the more curious, please read the incredibly illuminating discussion by Weinberg on state and structure, as well as the work of Barwise and company on the modeling of logical systems (Stanford’s CSLI).
  • I also invite you to read Mr. Bruce Silver, who discusses the distance from BPMN to BPEL, from market strategy to the issues related to the implementation of BPEL standards.
  • I will review my model once in the future, in order to check that everything is right. No guarantees for the moment.
  • I may add to it that XPDL (used by Bull’s products) is an interesting and may-be-extremely-structured language, but I am quite confident that you can also write things that do not make sense (on this domain, I only trust formal proofs and/or model-checking). Of course, I won’t spend time on doing that: it is possible, more than probable, and completely useless.

I do not put more pointers or bibliography, because of lack of energy. Sorry. Lo siento.

Posted in BP Modeling, BPMN

Pre-BPM Pragmatics

Posted by José D. De la Cruz on December 3, 2008

In most human systems, communication in Business Processes is normally done via “data transfer” (“fill form XYZ-02A1, sign it and give it to me”), where each participant transfers segments of data that are then somehow added to a bulk of data (“let me put this into your application folder and I’ll come back to you”). This approach assumes that:

  • each source/sink of data is an isolated element in the puzzle (“silos”)
  • the sources/sinks of data are difficult to interact with, and therefore communication should be optimized (reduced to the minimum possible)
  • interactions are not interactive, but mostly single-shot. This is supposed to guarantee the integrity of process execution.
  • each source/sink of data is a trusted participant:
    • Business logic is somehow built in each one
    • each knows how to do its task and does it correctly
    • The whole bulk of aggregated data is passed from one source/sink to the next

AN EXAMPLE
Let us suppose that we have this long-running business process. It is some application that requires legal papers (visa/University/building a house/etc.). I chose this as an example because I’ve seen an increase in the number of e-government initiatives, but most of them only give you access to some PDF document you can print.
I consider in this example that the application process is compatible (but not yet implemented) with the notion of “electronic record” or a dossier that can be made up of digital files.

If you take into account the tacit modeling principles listed above, it is no wonder that you receive mails like this after submission:
“We hereby confirm that we have received your application folder. Do not contact us. You should receive a response within 2 months. The processing delays are notified via our website.”

Besides, the bulk of data is built using non-scalable/non-incremental means, like hardcopies (“please print this form and include it in your application folder”) that do not support versioning, and that can only be changed by a total blocking of the processing.

Let us say that you receive, after 2 months waiting:
“Dear Mr./Mrs., we have found that section 13, paragraph B, line 3 is not readable. Please correct it and re-submit your application folder… Do not contact us. You should receive a response within 2 months after re-submitting. The processing delays are notified via our website.”

Does this sound realistic to you? Let us say now that you changed what you were told to change, but because of the processing times, a legal certification (valid for only 3 months) will expire during the resubmission. You were careful, but given the asynchronous nature of the process (silo + slow communication) you cannot avoid it. Therefore, this will invalidate your application once more.
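The arithmetic behind this trap is simple enough to write down. The sketch below is only an illustration in Python: the issue date is invented, “2 months” is approximated as 60 days and “3 months” as 90 days, and the applicant is assumed to lose no time at all between steps.

```python
from datetime import date, timedelta

# Illustrative assumptions, not figures from any real administration.
RESPONSE_DELAY = timedelta(days=60)        # "within 2 months"
CERTIFICATE_VALIDITY = timedelta(days=90)  # "valid for only 3 months"

certificate_issued = date(2008, 1, 1)      # hypothetical issue date
first_submission = certificate_issued      # best case: submitted the same day
rejection_received = first_submission + RESPONSE_DELAY
resubmission = rejection_received          # best case: corrected the same day
second_response = resubmission + RESPONSE_DELAY
certificate_expires = certificate_issued + CERTIFICATE_VALIDITY

# True means the certificate runs out before the second decision arrives.
expired_before_decision = certificate_expires < second_response
print(certificate_expires, second_response, expired_before_decision)
```

Even in this best case, two response delays add up to about 4 months, so a 3-month certificate cannot survive a single round of correction.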

DEADLOCK IN PRE-BPM SYSTEMS
Let us suppose now that you guessed the expiration of this legal document might be a problem. You anticipate this issue and get your legal paper renewed after sending the application, just in time before the three-month expiration date.
You send it, so your application should be updated with a valid version of the legal paper. However, the application is considered incomplete and will be sent back to you.

Why is this happening?
The processing team considers that adding a new element “is against the rules” and cannot be done without “opening the door to other people abusing the system”.
No smart administrative chief will agree on doing that.

Worse, the current, valid version of the legal paper is no longer in your hands, but somewhere in that organization.

Transfer Bulk of Data Patterns
The data-transfer approach in pre-SOA and pre-BPM systems puts the bulk of information of application X on top of a pile in the office of A, then somewhere in the office of B, and so on. When sensitive information must not be seen by B, some person AB has to be included to hide/exclude that information from the bulk. Once B finishes his processing, AB can rebuild the bulk X.

This introduces delays and complexity and, thus, increases the probability of errors. In a paper-based world, this translates into more workload per employee, more focus on repetitive/tedious tasks than on value-adding work, and more stacks of papers and of procedures to process them.

Then, scalability gets even more compromised.

CONCLUSION
This is not flexible, this is not scalable, and this is not proper: the user is not satisfied, the process is not transparent, and the service provider cannot really assess where the problem is. Then, there is no improvement.

How come we have technology to interact with the rest of humanity (mostly for fun) and not with the processes that really add value to our jobs and even our lives?

This can have a great economic impact. In Switzerland, where I live, a study showed that obtaining a work permit for an EU national takes 10 weeks, but that the real processing time is only 15 minutes. For non-EU nationals this takes even longer. What is the cost of these delays? The lost revenue for all the stakeholders? The stress?

As they say here: Bouf! :-<

Posted in BP Modeling