How to design a modelling language ontology from empirical data?

I don’t know how yet. I’ve been thinking about this for about two years now.

The only idea that has stuck so far is to work in five steps.

Firstly, to identify recurring terms used in the domain to describe problems and solutions.

Secondly, to estimate, somehow, their relative importance to experts in that domain, when they identify and solve problems.

Thirdly, to define the concepts for the ontology, and fourthly, the relations. Relations are the harder part.

Finally, to define rules for concept and relation use.

We are going in this direction, in the work on requirements elicitation, but we still have a long way to go. We are, roughly speaking, at the second step.
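
To make the first step a bit more concrete, here is a minimal sketch of counting recurring terms in a corpus of domain documents or interview transcripts. The folder name, the stopword list, and the assumption of plain-text files are mine, for illustration only; they are not part of the approach described above.

import re
from collections import Counter
from pathlib import Path

# Very small stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are", "that", "it"}

def recurring_terms(folder, top_n=50):
    # Count word occurrences across all .txt files in `folder`, ignoring stopwords.
    counts = Counter()
    for path in Path(folder).glob("*.txt"):
        words = re.findall(r"[a-z]+", path.read_text(encoding="utf-8").lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)

# Usage: print the 50 most frequent candidate terms from a folder of transcripts.
# for term, n in recurring_terms("transcripts"):
#     print(term, n)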





What is missing in formal models of argument?

Formal models of argument, such as Dung’s argumentation framework, usually do not answer the following interesting questions:
- How to detect groupthink in arguments?
- How to check if arguments are specific enough to the context and topic, so as to sanction the use of generic arguments?
- How to detect the availability bias in arguments?
- How to evaluate the relevance of an argument?
- How to evaluate the relevance of the attack of an argument on another one?
- Which extensions to prefer, and why?
- How to detect manipulation of the arguments and the attack relation, in favour of one extension or a subset of extensions?

There are many other such questions, but there is relatively little work that I know of on the ones above.
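
For contrast, here is an example of what such frameworks do answer well. The sketch below computes the grounded extension of a Dung-style framework by iterating its characteristic function; the argument names and attacks are illustrative, not from any real debate.

def grounded_extension(arguments, attacks):
    # `arguments` is a set of labels; `attacks` is a set of (attacker, attacked) pairs.
    attackers_of = {a: {x for (x, y) in attacks if y == a} for a in arguments}

    def defended(candidates):
        # An argument is acceptable w.r.t. `candidates` if every attacker of it
        # is itself attacked by some member of `candidates`.
        return {a for a in arguments
                if all(any((d, b) in attacks for d in candidates)
                       for b in attackers_of[a])}

    extension = set()
    while True:
        larger = defended(extension)
        if larger == extension:
            return extension
        extension = larger

# Example: a attacks b, b attacks c; the grounded extension is {a, c}.
print(grounded_extension({"a", "b", "c"}, {("a", "b"), ("b", "c")}))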





How to write emails?

I follow the rules below when writing emails. Many people who work with me do the same. I highly recommend them.

They are inspired by similar rules which Nikola Tosic, Andrea Toniolo, and I designed and use at JTT Partners, to coordinate remote teams efficiently.

If you are a student, apply these rules, and I will reply faster to your email.

Rules:

- No bcc (blind carbon copy).

- One topic per email.

- Email subject should clearly state the topic.

- One thought per paragraph.

- Short paragraphs.

- One empty line between every two paragraphs.

- One verb per sentence, if feasible. In general: minimise the number of verbs in a single sentence.

- No passive voice.

- No “we”. Say who.

- If you want to ask me something, then include one or more clear questions, which end with the question mark “?”.

- If you want me to do something, then say what, and suggest a deadline.

- If you want to meet me, then propose at least two meeting slots.

- If you need my approval for something, then the word “approval” has to appear in the question.

- Titles and other formalities do not matter to me. I will treat you with the same formalities (or absence thereof) with which you treat me.





Why is a syntactic consequence relation, if it tries to represent human reasoning, probably not monotonic?

A syntactic consequence relation relates two formulas of a formal logic if and only if it is possible to apply the proof rules of that logic together with (or on) the first formula, in order to deduce the second formula. The symbol for the syntactic consequence relation is called “turnstile”, is written \vdash in LaTeX, and looks like “|-”.

So if you write X |- Y in some logic you like, and if the turnstile is well-defined in it, then it means that there is a proof, which can be made using X and the proof theory of that logic, to deduce Y. Note that X may be a set of formulas, which in classical logic is the same as saying that X is the conjunction of all formulas in the set.

Now, the syntactic consequence relation in classical logic is monotonic, or satisfies the property called monotonicity. This means that if you write X |- Y, and later you add W to X (say, you make a union of all formulas in W and all those in X), you will also be able to write X + W |- Y, and you would not be wrong. By “+”, I mean some operation where you keep all formulas of X, and add some new ones, which are in W. So you are not removing anything from X.
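
In symbols, with set union playing the role of the “+” above, monotonicity says:

\[
\text{if } X \vdash Y \text{, then } X \cup W \vdash Y \text{, for any set of formulas } W.
\]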

Informally, if you have a monotonic syntactic consequence relation, adding new formulas (very roughly, new information) to the set of those you have still lets you draw (deduce) the same conclusions you could draw before.

The problem with this is that it does not look like a good property when you are trying to define a turnstile which somehow resembles human reasoning. The reason is that new information may interact with old information in such ways that you cannot draw the same conclusions. So you still have X + W, but now you cannot deduce Y, or perhaps you can, but at least some of the conclusions that you could draw from X alone cannot be drawn anymore from X + W.
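
A standard textbook illustration, not from the discussion above: suppose we define a defeasible turnstile which sanctions the default that birds fly, and write it \vdash as well. Then:

\[
\{\mathit{bird}(t)\} \vdash \mathit{flies}(t)
\qquad \text{but} \qquad
\{\mathit{bird}(t),\ \mathit{penguin}(t)\} \not\vdash \mathit{flies}(t)
\]

Adding the penguin formula removes a conclusion that could be drawn before, so this turnstile violates monotonicity.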

If a turnstile violates monotonicity, then it is non-monotonic. There is considerable work on non-monotonic logics. I like formal argument systems in particular, and recommend this survey, if you are interested: Chesñevar, Carlos Iván, Ana Gabriela Maguitman, and Ronald Prescott Loui. “Logical models of argument.” ACM Computing Surveys (CSUR) 32.4 (2000): 337-383.





When to solve a decision analysis problem using an argumentation system?

In short, the alternative that maximises expected utility, and is therefore the solution in a decision analysis problem, can be seen as the only acceptable argument in an argumentation system (where you can understand the argumentation system as in, for example, Dung, Phan Minh. “On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games.” Artificial Intelligence 77.2 (1995): 321-357).

The reason why this simple observation is interesting is that it is usually hard to elicit or otherwise obtain the information needed to quantify the uncertainty and desirability of alternatives, and from there find one that maximises expected utility.
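
For reference, this is the usual decision-analytic formulation being referred to, in notation that is mine rather than from the text above: A is the set of alternatives, S the set of possible states, P a probability distribution over states, and u a utility function. The solution is the alternative

\[
a^{*} = \arg\max_{a \in A} \sum_{s \in S} P(s)\, u(a, s),
\]

and it is precisely P and u that are usually hard to obtain.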

In contrast, an argumentation system can be constructed by searching for counterarguments for each alternative, until only one is acceptable. The arguments can be any kind of information, as long as the individuals involved in argumentation recognise them as arguments.

In practical terms, when you cannot formulate a problem as a decision analytic one, perhaps you can formulate it as an argumentation system, and look for a solution by adding counterarguments to the system, until only one alternative remains acceptable, and all others are not acceptable.





How to (pragmatically) teach someone the basics of process modelling?

If person A wants to learn the basics of process modelling, put A in a kitchen and have someone else prepare a dish. Ask A to write down, in a numbered list, the steps she would do in order to prepare the same dish herself.

After A has written the list, ask her to check the following, and update her list accordingly:
- Is the sequence of steps clear?
- Are inputs to each step clearly stated?
- Are outputs of each step clearly stated?
- Is it clearly stated what to do in each step, to produce outputs from inputs?
- Is it clear who does each step?
- Is it clear what tools and resources (fruits, vegetables, eggs, etc.) are needed at each step?
- Are there steps which need to be repeated? Is this clear from the list?
- Are there steps to do in parallel? Is this clear from the list?
- Does the list say which conditions should be satisfied, in order to start and stop a step (something should have changed colour or texture, water should have come to a boil, etc.)?

After she has updated the list, recheck the same questions, and repeat until the list of steps is satisfactory.

Then, ask A to make the dish herself, and during and after this, ask her to recheck the same questions, and update her list of steps according to the experience of preparing the dish.

After the above, A will probably grasp more easily the purpose of process modelling, its difficulties and limitations, and perhaps process modelling languages as well.





What is the relationship between business analysis and decision-making?

If a decision is a commitment to a course of action, then decision-making designates the cognitive processes that result in such commitments.

The conservative view of decision-making, visible in such fields as expected utility theory in economics and decision analysis in management science, is that the individual who makes the decision identifies alternative actions in a given situation, evaluates each alternative relative to the others, and chooses the one which is, according to some set of criteria which this person adopted, the best among the alternatives. The decision is the commitment to act according to the best alternative. For less conservative views in the said fields, search for “nonstandard utility theory” on Google Scholar, for example.

Business analysis involves decision-making, but has three differences, one in terms of focus, and two in terms of scope. One is that decision-making theories are usually not specific to a domain, even one as broad as “business”, while business analysis focuses on business situations. The second is that decision-making is less concerned than business analysis with methods to apply, to collect and elicit the information about the problem, alternatives, and criteria. The third is that business analysis usually produces advice, rather than the actual commitment, the former being a recommendation on how to act, the latter being the adoption of a course of action.

In short, then, business analysis involves decision-making, but focuses on decisions related to business problems / opportunities, is interested in how to get information for, then design / define the problem, alternatives, and criteria, and produces advice on what to decide, not decisions themselves.





What is a business analysis method?

A business analysis method is a set of well-defined tasks to do, either in order to solve problems which obstruct organisations from creating value, or to realise opportunities that these organisations have identified.

Tasks are well-defined if it is clear:
- why they need to be done,
- what their inputs and outputs are,
- how to transform the inputs into outputs,
- what skills are required to do so,
- which resources are used.

It is at least as important that it be clear:
- why some task has to be done instead of another,
- why it has to be done as described and not in another way, and
- which tasks precede it, follow it, and can or should be done in parallel to it.

A business analysis method is likely to include tasks which explain how to do the following:

- understand the problem or opportunity, which involves preparation, to understand the terminology and practices in the relevant problem domain, and the elicitation of information and knowledge from that problem domain and the people involved;

- synthesise the collected information and knowledge, in order to produce a clear formulation of the problem;

- design the solutions, which results in the description of alternative solutions;

- evaluate the alternatives, which consists of identifying their respective merits, limitations, and risks;

- select one of the alternatives, based on the results of the evaluation;

- recommend the implementation of the chosen alternative; and

- supervise the implementation of the solution, and adjust it based on what is observed during implementation and use of the solution.





What is the “median entrepreneurial startup”?

It is the average tiny or small new company, rather than the rare, highly successful startup described in the press. It is useful to leave the hype aside, and keep in mind the following, when deciding to invest in, work with, or work for most startups:

“Starting such a firm is like entering a lottery (Storey, 2011; Vivarelli, 2011: 201), with high death rates, skewed returns with most players losing out, random growth, little or no entrepreneurial learning (“learning to roll a dice” [Frankish et al., 2013]), no influence of education on performance, little control over outcomes but substantial overconfidence among players. Like the median lottery player who does not make money after arguably irrationally entering a game where the average payoff is less than the ticket price, most entrepreneurs do not gain a wage premium compared with waged workers. Like lottery players they are psychologically happier, which may be related to them being more optimistic and overconfident (Camerer and Lovallo, 1999; Parker, 2004). As with lottery players, it is not clear that unsuccessful entrepreneurs should be encouraged or subsidized to try again, given that the evidence on entrepreneurial learning from large-scale studies of unsuccessful entrepreneurs is generally weak (Metzger, 2006; Frankish et al., 2013). And lastly, as with lottery players, a tiny minority of “winners” is very visible in the popular press, while the large number of losers is overlooked.”

The quote above is from: Nightingale, Paul, and Alex Coad. “Muppets and gazelles: political and methodological biases in entrepreneurship research.” Industrial and Corporate Change 23.1 (2014): 113-143.

Despite all that, I like working with happy people, so I appreciate working with all kinds of startups. I recommend the same to you :-)





Distinguished paper award at CAiSE 2014

Corentin, Stéphane, and I received the distinguished paper award at the 26th International Conference on Advanced Information Systems Engineering, for the paper:

Burnay, Corentin, Ivan J. Jureta, and Stéphane Faulkner. “An Exploratory Study of Topic Importance in Requirements Elicitation Interviews.” Advanced Information Systems Engineering. Springer International Publishing, 2014.

CAiSE selected three papers, of which one was elected the best paper, and two were selected as distinguished papers. Ours was one of the two distinguished papers. Great news :-)





Why is it hard to analyse decision methods?

A decision method explains how an individual, or a group of people, makes a decision. If a decision method is known, and can be taught, then it can be applied by others when they need to make decisions.

Democratic elections are an example of a well known decision method. If you want to elect a president of something via democratic elections, then there is a set of rules you need to apply, such as that every vote counts as one vote (no more, no less), that the candidate who receives more than half of all votes wins, that no voter should be coerced to vote, and so on. Various decision methods that people and animals apply are surveyed in Conradt, Larissa, and Christian List. “Group decisions in humans and animals: a survey.” Philosophical Transactions of the Royal Society B: Biological Sciences 364.1518 (2009): 719-742.

Now, suppose that you need to analyse a previously unknown decision method. By this, I mean that you are given access to the individual, or the group, who are using some rules to make decisions in relation to a particular decision problem, but there is no documentation of these rules, and you want to define the rules that they apply, then see if there are ways to change them, in order to improve some aspect of their decision making, such as removing some bias, speeding it up, etc.

This is hard to do for many reasons. For example, it may be hard for them to give you the rules when asked, because they do not see clearly the regularities in their decision making (or there may be none).

Another issue is that it can be difficult to isolate the actual decision you want to focus on. If a decision is a commitment to a course of action, then it may not be accessible to you, since commitments may only have direct effects on these people’s thinking, and only later produce tangible changes that you can observe (a document which says when and what the person decided, for example, such as a contract).

A third issue is that it may be difficult to isolate a decision problem that you want to analyse the decision method for, because it may be occurring systematically together with some other decision problem.

The issues above, and others, are nicely discussed in Langley, Ann, et al. “Opening up decision making: The view from the black stool.” Organization Science 6.3 (1995): 260-279.





Why is product design “not great” in open source software?

I sat in on a fairly general talk on open source software that Anthony Wasserman gave at the CAiSE 2014 conference.

The majority opinion in the audience was that product design is “not great” in open source software, and certainly less advanced than the software engineering in such software.

It was suggested that open source software projects do not pay much attention to product design.

An alternative, and equally weak, explanation is that it is easier for a community of software developers who participate in an open source software project to decide if one piece of code is better than another, while it may be harder for them to decide (they lack the expertise to draw equally clear criteria) if one product design decision is better than another.

In any case, it would be good to look at a few dozen, or a few hundred, open source software projects, at how they make product design decisions (they inevitably make them), and see if there are any recurring factors which influence how the team makes those decisions.

Looks like a nice MSc thesis research project.





What is domain-independent conceptual modelling and what is the risk in doing it?

Domain-independent conceptual models are designed so that they can be used in different domains.

So, a relatively simple example would be, say, a model of a family, which describes family members, parenthood, and similar relationships between them. Perhaps the most complicated examples are foundational ontologies, such as DOLCE and UFO. CYC is also related to this.

While I understand that such conceptual models could be reused, I think that making them is risky, for several reasons. Two of them are below.

One is that models normally should serve a precise purpose, and reuse is not a precise purpose, in that it does not help determine the appropriate scope of the model (scope being what the model should, and therefore also what it should not, represent). Reuse does not tell me specific questions that the model should answer. In other words, if you start making a conceptual model, and your main goal is reuse, rather than solving some specific problem with that model, in a specific domain, there’s no way of telling what you’ll end up with. How will you delimit your model’s scope, and decide its depth, for example?

Another is that I simply do not believe that there exist domain-independent conceptual models. I will when I see one. Even if you give me a highly generic conceptual model of a family, it may look generic simply because it reflects common sense notions about family, which at the same time makes it poor for reuse in, say, the legal domain, for keeping data about parenthood, childcare, and so on.

I think there is much less risk of making generic and irrelevant conceptual models, if you make conceptual models with a precise, specific, and pragmatic purpose in mind, rather than pursue some ideal of reuse.

Going back to the model of family, if you need to make it for use in the legal domain, for example a court for minors, then make a conceptual model which is highly specific to that setting. Then, if you need to make a model for use in insurance, again, make a specific one. The design of the former can inform the design of the latter, but claiming one best model that can be reused in both these settings comes with the problem that you may be forcing a common sense conceptualisation of a family into a domain which has its own, highly specific conceptualisation of a family.

As a final note, I am not saying foundational ontologies are irrelevant, only that very few people, if any, can make them in such a way that they are useful for practical knowledge engineering. I see their benefit mostly as material that helps discuss recurring issues (say, what distinguishes endurants from perdurants?), identify best practices in ontology engineering and conceptual modelling, and for education in these two fields, and more broadly in AI.





What is a Language Service, and why use it in modelling language design?

The usual way of defining a new modelling language (languages like UML, BPMN, but also formal logics, such as LTL, CTL, paraconsistent logics, etc., really any language you can make models with) is to first decide what its purpose is. You then find and define the concepts and relations, which should be used to create models with that language, so that you can use these models to realise the purpose. If it is a formal language, then you define and prove some desirable properties that its models have.

If the purpose is, say, to model processes, then you probably need a concept for the smallest (primitive) steps that you want to make processes of, and relations to say that one step is before another, is done in parallel with another, and so on. So in that language, you can say in a model that step A is before B and that B is before C, and that C has to be done in parallel with step D.

It can be difficult to precisely define the purpose of a modelling language. To say that a language needs to model processes is not precise, which is illustrated by the variety of modelling languages which already exist, and all model processes in (sometimes only slightly) different ways.

The practical problem is that if you do not precisely define the purpose of the language, then designing it is harder, as your target starts moving: you make a language that can represent sequences of steps, and then you discover that you need to show parallel steps, and then, perhaps you think it is a good idea to enable it to show the timing of steps, and so on.

This is an issue I’ve seen over and over, and not only with young researchers.

A simple way to deal with these problems is to define Language Services that the language should deliver, before you choose the language components, define how they work together, and so on.

A Language Service is simply a question that you want ANY model of the language to be able to answer, and the answer should be the same, regardless of who asks (you, someone else, a machine that can read the models).

Here is an example of a Language Service:

Given a model M and a step x in M, which steps should be done before doing x?

Different languages can deliver this Language Service, but if a language defines what a step is, and conditions when a step is before another one, then it is clear how to compute the answer.

If the above were the only Language Service that a language had to deliver, then the following language, call it P, is the simplest one which does so.

In P, any model is a set of expressions, and every expression is a pair (a, b), where a and b represent steps, and (a,b) represents that a should be done before b.

So P is a language with one concept, called “step”, and one relation, called “before”. If the before relation is transitive, then to answer the question in the Language Service, you have to find the transitive closure of the graph in which nodes are steps and edges are before relations. The answer is the set of all steps on all paths that end in the step x. And the answer is independent of who asked, that is, it is not open to interpretation.
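
Here is a minimal sketch of P and of that single Language Service, assuming a model is just a set of (a, b) pairs read as “a should be done before b”. The step names and the example model are illustrative.

def steps_before(model, x):
    # Return every step that should be done before step x, that is, every step
    # on some path of "before" edges ending in x (reverse reachability, which
    # amounts to reading predecessors off the transitive closure).
    result = set()
    frontier = {x}
    while frontier:
        current = frontier.pop()
        for (a, b) in model:
            if b == current and a not in result:
                result.add(a)
                frontier.add(a)
    return result

# Example model: a before b, b before c, d before c.
model = {("a", "b"), ("b", "c"), ("d", "c")}
print(steps_before(model, "c"))  # {'a', 'b', 'd'} -- the same answer, whoever asks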

So the next time you need to make a modelling language, start by listing the Language Services that it should deliver.





Why will security certification be regulated by law for some classes of software?

If some software is critical for the operation of basic infrastructure of a country, like healthcare (for example, software for keeping patient records), or transportation (software for managing rail and air traffic), or policy-making (software for recording votes in a parliament), then why are there no institutions, or other kinds of (hopefully) independent organisations which would check the designs of that software, then test it, in order to give a time-restricted certificate that it can be used?

I’ve just heard a talk on cutting edge research on software security, and it mentioned nothing about incentives, sanctions, the role of the social and legal environments in which software runs, or about certification, who may do it, and how.

Yet the talk was given in a hall which has fire protection systems, which had to be certified during building design and then at delivery, before the owner of the building could put it to use. That certification is regulated by the state.

I do not see why the software I mentioned above is allowed to run without being certified in ways analogous to what is done in construction, in aviation, and in many other domains.

This may be due to the youth of software engineering relative to more established engineering fields.

I expect that in the next 10 to 20 years, there will be more and more research and public focus on the regulation of software engineering, which will involve, among other things, the design of incentive and sanction mechanisms and certification procedures, which would all need to be reflected in new laws.

I see that trend as inevitable, given how critical software is, or is becoming, for running public services.




