
Scrum please!

Last week I had the pleasure of attending a Scrum Master training given by Zilverline. I have to admit I was skeptical about Scrum at first. I am a trained Prince2 Practitioner, so I figured this was probably “just old wine in new bags”, as we say in Holland.

And in a way that is the case, of course. But still, the principles of Scrum are very attractive, especially since I realized I was already working in a very agile way. This has mainly to do with the environment and the product I work with. In this environment I am a trusted resource with quite a lot of freedom to put product increments into production. The product owner is “a man with a vision” who can look at the bigger picture. And the product is BI, where quick prototyping is needed to demonstrate the potential of a new report, and the time from idea to product can be short.

What I liked about Scrum is the limited overhead: 4 meetings, 3 roles, 2 lists. Compared to Prince2 that is a lot less overhead to maintain. The roles are limited and clear. The role of the Product Owner makes a lot of sense, especially since this role is carried by one person.

The Planning Poker was something I liked very much as well. I think it can be a good team-building exercise, and it also emphasizes the principle that estimating is very difficult and should not become a goal in itself.

It reminds me of a quote from Terry Pratchett’s Going Postal:
“Mr. Pony struggled manfully with the engineer’s permanent dread of having to commit himself to anything, and managed, “Well, if we don’t lose too many staff, and the winter isn’t too bad, but of course there’s always—””

So in that sense, if you do not have to commit completely but just go for an order of magnitude, you quickly get a good enough estimate of the workload. And you can start the work!

In my current project we could do with a bit more structure, and additional team building is necessary to tear down the artificial organizational boundaries. I am convinced Scrum will help achieve that goal.

I passed the test, so I can now call myself a Scrum Master. I do feel like a sage!

To conclude: Scrum please!

Enterprise deep learning with TensorFlow

An interesting course offered by openSAP is Enterprise Deep Learning with TensorFlow, which is currently running in its last week. It gave me great insight into the current state of machine learning possibilities.

It was a very hands-on training in which it was possible to play with TensorFlow, an open-source library for numerical computation. For SAP, TensorFlow is a key element in the SAP Leonardo Machine Learning architecture. With SAP Leonardo, SAP aims to make machine learning easy to use for businesses.

Deep learning is a sub-field of neural networks, machine learning, and artificial intelligence. It is inspired by the architecture of the human brain and consists of neural networks with many layers.

Deep learning is a promising approach when:

  • there is a large amount of training data available
  • it concerns solving an image/audio/natural language problem
  • the raw input data has little structure and the model needs to learn meaningful representations from it (e.g., pixels in an image)
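
To make the idea of “a neural network with many layers” concrete, here is a minimal sketch in TensorFlow’s Keras API, trained on random data purely for illustration. This is my own toy example, not code from the course, and the shapes and layer sizes are invented:

```python
# A minimal "network with many layers", trained on random data for illustration.
import numpy as np
import tensorflow as tf

# Fake training data: 1000 samples, 20 features, binary label.
x_train = np.random.rand(1000, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(1000,))

# Stacking several dense layers; "deep" simply means more than one hidden layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # probability of the positive class
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=0)
```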

One of the topics in the course was convolutional networks. Convolutional networks are used to classify objects in images. The complexity of doing this is enormous, but by combining several techniques and applying smart optimizations it becomes feasible.
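
As an illustration of what such a network looks like in code, here is a minimal convolutional classifier in Keras for hypothetical 32x32 colour images with 10 classes. The architecture is made up for this sketch and is not the one discussed in the course:

```python
# A minimal convolutional image classifier (illustrative shapes only).
import tensorflow as tf

model = tf.keras.Sequential([
    # Convolution layers learn local image features (edges, textures).
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D((2, 2)),       # pooling reduces the spatial size
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),  # one probability per class
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```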

Some examples of use cases were also given. One of them was medical image segmentation with fully convolutional networks. In this example, images retrieved from an MRI scanner are processed with a fully convolutional network to construct a new image that points out possible cancer cells.

I found the explanation of how to deal with unsupervised and reinforcement learning very informative as well. To explain: machine learning applications fall into three broad categories:

  • Supervised learning: here there is a dataset with labels or annotations. Usually this dataset is not too big, because it is costly to label all the data. Most machine learning is done this way.
  • Unsupervised learning: here there is data without labels or annotations. Typically this data is generated by machines or software, in an Internet of Things kind of way. Machine learning offers techniques to identify anomalies and outliers in the data, making good use of it. An example is a financial pattern that is monitored: when an anomaly occurs, this can be due to fraud (a minimal sketch of this idea follows this list).
  • Reinforcement learning: here there is no initial dataset; the dataset is accumulated with experience. The machine learning agent interacts with the environment in a trial-and-error kind of way. An example is a robot learning a task: it performs actions, and when an action is correct it is rewarded, while an incorrect action earns no reward or even a penalty.
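
To make the unsupervised example a bit more concrete, here is a minimal anomaly detection sketch on synthetic “transaction” data. It uses scikit-learn’s IsolationForest rather than TensorFlow, simply because that keeps the example short; the amounts and the contamination rate are invented for illustration:

```python
# Unsupervised anomaly detection on synthetic transaction amounts.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Normal transactions around 100, plus a few extreme ones that could be fraud.
normal = rng.normal(loc=100, scale=10, size=(500, 1))
outliers = np.array([[400.0], [5.0], [950.0]])
amounts = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.01, random_state=0)
labels = detector.fit_predict(amounts)   # -1 = anomaly, 1 = normal

print("flagged amounts:", amounts[labels == -1].ravel())
```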

Another inspiring example was the generation of new images using GANs (Generative Adversarial Networks). In this example a generator generates images, and this is combined with a discriminator that determines whether an image is real or a (possibly blurry) fake. This approach gives impressive results.
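
The setup is easier to picture with the two networks written out. Below is a minimal sketch of a generator and a discriminator in Keras for hypothetical 28x28 grayscale images; the training loop that pits them against each other is omitted for brevity, and none of this is code from the course:

```python
# The two networks of a GAN, with made-up layer sizes for illustration.
import tensorflow as tf

latent_dim = 100  # size of the random noise vector fed to the generator

# Generator: noise -> fake image.
generator = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(latent_dim,)),
    tf.keras.layers.Dense(28 * 28, activation="sigmoid"),
    tf.keras.layers.Reshape((28, 28, 1)),
])

# Discriminator: image -> probability that the image is real.
discriminator = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# During training the two play a game: the discriminator learns to tell real
# images from generated ones, while the generator learns to fool it.
noise = tf.random.normal((1, latent_dim))
fake_image = generator(noise)
score = discriminator(fake_image)
print("discriminator score for a fake image:", score.numpy()[0, 0])
```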

To conclude: another very inspiring course from the openSAP learning environment. Very useful machine learning techniques for businesses were presented.

Getting Started with Data Science

In March this year I enrolled in the openSAP course ‘Getting Started with Data Science’. It gave great insight into the business value a data scientist can bring and how SAP can make their life easier.

Some elements I would like to point out.

Project methodology

What I really liked in the course was the use of the Cross-Industry Standard Process for Data Mining (CRISP-DM) to walk through the steps of the process. Using this methodology, the data science process becomes reliable and repeatable by people with little data science background. It provides a framework for recording experience and allows projects to be replicated.

https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining

In the first phase, the business understanding phase, the goals of the project are determined. The success of the project must be described from a business perspective and a data science perspective.

The success criteria for the different data science models will differ depending on whether the models are predictive or descriptive type models and the type of algorithm chosen.

Descriptive analysis describes or summarizes raw data and makes it more interpretable. It describes the past, i.e. any point in time that an event occurred, whether it was one minute ago or one year ago. Descriptive analytics are useful because they allow us to learn from past behaviors and understand how these might influence future outcomes. Common examples of descriptive analytics are reports that provide historical insights regarding a company’s production, financials, operations, sales, inventory and customers. Descriptive analytical models include cluster models, association rules, and network analysis.

Predictive analysis predicts what might happen in the future, providing estimates of the likelihood of a future outcome. One common application is the use of predictive analytics to produce a credit score. These scores are used by financial services companies to determine the probability of customers making future credit payments on time. Typical business uses include understanding how sales might close at the end of the year, predicting what items customers will purchase together, or forecasting inventory levels based on a myriad of variables. Predictive analytical models include classification models, regression models, and neural network models.
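
To stay close to the credit score example, here is a minimal predictive sketch: a logistic regression on synthetic customer data. It uses scikit-learn only to keep the example self-contained; the course itself works with SAP Predictive Analytics, and all features and numbers below are invented:

```python
# A toy "credit scoring" model: predict the probability of paying on time.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic features: income (in thousands) and number of missed payments.
income = rng.normal(50, 15, 200)
missed = rng.poisson(1.0, 200)
X = np.column_stack([income, missed])
# Synthetic label: 1 = paid on time, more likely with high income / few misses.
y = (income / 100 - 0.4 * missed + rng.normal(0, 0.3, 200) > 0).astype(int)

model = LogisticRegression().fit(X, y)
new_customer = [[45.0, 2]]                     # income 45k, 2 missed payments
print("probability of paying on time:", model.predict_proba(new_customer)[0, 1])
```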

Data understanding and data preparation, steps 2 and 3 in the process, are the most time consuming and take up about 50% to 80% of the total time.
A friend of mine is a data scientist and he fully agreed with that. He considers this step very important to get a ‘feel’ for the data.
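
In practice this “getting a feel for the data” step often starts with a handful of simple checks. Here is a minimal sketch with pandas on a hypothetical CSV extract (the file name and the ‘region’ column are made up for illustration):

```python
# First-look checks on a hypothetical data extract.
import pandas as pd

df = pd.read_csv("sales_extract.csv")

print(df.shape)                      # how many rows and columns?
print(df.dtypes)                     # are numbers really numeric, dates really dates?
print(df.isna().sum())               # missing values per column
print(df.describe())                 # ranges and outliers of the numeric columns
print(df["region"].value_counts())   # distribution of a categorical column
```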

After the data understanding and data preparation, the modelling starts. There are a lot of models available, depending on the problem you are trying to solve (a small clustering sketch follows the lists below).

Descriptive models:

  • Association
  • Clustering

Predictive models:

  • Classification – binary target variable
  • Detection of anomalies or outliers (data cleansing or decision support)
  • Regression – continuous target variable
  • Forecasting with time series data

In the course, some very nice exercises were given that provided a better understanding of what it is like to work as a data scientist with the SAP Predictive Analytics tool.

With this course my understanding of the work of a data scientist has increased considerably. A data scientist spends most of his/her time getting to know the data. That makes a properly working data warehouse relevant as a source of reliable data. Also, once a model has been approved, the rules of the algorithm can be incorporated in the BI environment to monitor the predictions and use the outcomes in an easy manner.

To conclude: SAP Predictive Analytics is a great tool for a data scientist to use, and it can thus increase the value of the BI environment as a whole.

SAP Lumira and Tableau compared

SAP Lumira and Tableau are both data visualization tools that can be used to explore data. Recently I did small projects in both tools, so I am finally able to make a comparison.

Tableau
Based on my experience of working with Tableau (version 10.3) I found that the tool is amazingly rich. There is great functionality for building graphs, especially when it comes to visualizing distributions of data. You can build nice tooltips that give additional information. Within the dashboard view it is easy to create a responsive layout, so you can use your dashboard on different devices. When I needed to figure things out, I could go online for help. For most topics I found the answer. I worked with the free online option, and that is a very generous tool already. What I also liked, is the concept that the tool just does things without prompting ‘are you sure’. That makes the feeling of the tool very quick. Of course you need to press the undo button quite often, when it turns out that you didn’t really want to do it, but you get used to that.
What I found difficult to work with were the horizontal and vertical grids. It felt as if things happened at random when I tried to place worksheets or other components in the grids; the swearing jar filled up very quickly. I also tried to build a story, but I didn’t find it a very useful option, since I couldn’t do a lot of customizing. Another thing: all this great stuff is sometimes hidden very well, so you need a lot of trial and error.

SAP Lumira
The data visualization tool SAP Lumira (version 1.31) has a nice focus on the process (data preparation – visualize – build a story). Compared with Tableau it lacks a lot of additional functionality. The basic graphs are covered, but special graphs such as box plots are very limited. Working with the tool does not feel quick, since there is a lot of prompting. Building a restricted or calculated measure is also more complicated, and less functionality is available. When it comes to online help, the SAP Community has been nearly murdered by the move to a new platform last October; since then a lot of contributors have been lost. Of course, the big advantage of SAP Lumira is that when working within an SAP environment you benefit from better integration.

So I must confess that I have become a big fan of Tableau. SAP will deliver a new Lumira release in the coming months; it will be interesting to see whether they have been able to bridge the gap.

To conclude: SAP Lumira is a nice tool to do the basics in data visualization, but Tableau is the tool to use when you need rich functionality.

HERUG 2017

HERUG2017 Day 5, Amsterdam, 13 April 2017

Last month (April 2017) I had the pleasure of attending the Higher Education & Research User Group (HERUG) conference in Amsterdam.

I was also a presenter at two sessions. On Tuesday I presented together with Pieter-Jan Aartsen of the UvA about the project we did on the implementation of the IMR. Part of this has been described in this blog.

On Wednesday I had the pleasure of presenting together with Masood Nazir of the UvA about the project we did concerning the implementation of the evaluation reports.

Both presentations can be downloaded from the HERUG site: Developing management dashboards for University of Amsterdam and Quality evaluation reports with SAP Design Studio.

I also visited some interesting sessions myself and made some useful contacts.

To conclude: It was a very interesting conference.