Johansson, Ulf
Publications (10 of 75)
Giri, C. & Johansson, U. (2021). Data-driven Business Understanding in the Fashion and Apparel Industry. Paper presented at The 18th International Conference on Modeling Decisions for Artificial Intelligence, On-line (from Umeå, Sweden), September 27 - 30, 2021.
2021 (English) Conference paper, Published paper (Other academic)
Abstract [en]

Data analytics is pervasive in retailing as a key tool to gain customer insights. Often, the data sets used are large, but also rich, i.e., they contain specific information, including demographic details, about individual customers. Typical uses of the analytics include personalized recommendations, churn prediction and estimating customer lifetime value. In this application paper, an investigation is carried out using a very large real-world data set from the fashion retailing industry, containing only limited information. Specifically, while the purchases can be connected to individual customers, there is no additional information available about the customers. With this in mind, the main purpose is to discover what the company can learn about its business and its customers as a group, based on the available data. The exploratory analysis uses data from four years, where each year has more than 1 million customers and 6 million transactions. Using traditional RFM (Recency, Frequency and Monetary) analysis, including looking at the transitions between different segments from one year to the next, some interesting patterns can be observed. As an example, more than half of the customers are replaced each year. In a second experiment, the possibility of predicting which customers are most likely not to make a purchase the next year is examined. Interestingly enough, while the two algorithms evaluated obtained very similar F-measures, the random forest had a substantially higher precision, while the gradient boosting showed clearly better recall. In the last experiment, targeting only the customers that have remained loyal for at least three years, rule sets describing patterns and trends that are strong indicators of churn or retention are inspected and analyzed.

Keywords
RFM modeling, Churn prediction, Fashion and apparel
National Category
Business Administration; Computer Sciences
Research subject
Textiles and Fashion (General); Business and IT
Identifiers
urn:nbn:se:hb:diva-26470 (URN)
Conference
The 18th International Conference on Modeling Decisions for Artificial Intelligence, On-line (from Umeå, Sweden), September 27 - 30, 2021
Available from: 2021-09-20 Created: 2021-09-20 Last updated: 2021-09-29
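The RFM analysis and the year-over-year customer replacement described in the abstract above can be illustrated with a minimal sketch. It assumes transactions are available as (customer, date, amount) tuples; the function names, the toy data and the snapshot date are hypothetical and are not taken from the paper.

```python
from datetime import date

def rfm_table(transactions, snapshot):
    """Aggregate (customer, purchase_date, amount) rows into per-customer RFM values."""
    table = {}
    for customer, day, amount in transactions:
        rec = table.setdefault(customer, {"last": day, "frequency": 0, "monetary": 0.0})
        rec["last"] = max(rec["last"], day)   # most recent purchase date
        rec["frequency"] += 1                 # number of transactions
        rec["monetary"] += amount             # total amount spent
    return {
        c: {
            "recency": (snapshot - r["last"]).days,  # days since last purchase
            "frequency": r["frequency"],
            "monetary": r["monetary"],
        }
        for c, r in table.items()
    }

def churned(customers_this_year, customers_next_year):
    """Customers who purchased this year but made no purchase the next year."""
    return set(customers_this_year) - set(customers_next_year)

# Hypothetical toy data, for illustration only.
transactions = [
    ("a", date(2020, 1, 10), 50.0),
    ("a", date(2020, 6, 1), 30.0),
    ("b", date(2020, 3, 15), 120.0),
]
rfm = rfm_table(transactions, snapshot=date(2020, 12, 31))
lost = churned({"a", "b"}, {"a"})
```

In practice the three values are usually binned into segments before transitions between years are compared; the binning scheme is left out here because the abstract does not specify one.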
Sweidan, D. & Johansson, U. (2021). Probabilistic Prediction in scikit-learn. Paper presented at The 18th International Conference on Modeling Decisions for Artificial Intelligence, On-line (from Umeå, Sweden), September 27 - 30, 2021.
2021 (English) Conference paper, Published paper (Other academic)
Abstract [en]

Adding confidence measures to predictive models should increase the trustworthiness, but only if the models are well-calibrated. Historically, some algorithms like logistic regression, but also neural networks, have been considered to produce well-calibrated probability estimates off-the-shelf. Other techniques, like decision trees and Naive Bayes, on the other hand, are infamous for being significantly overconfident in their probabilistic predictions. In this paper, a large experimental study is conducted to investigate how well calibrated the models produced by a number of algorithms in the scikit-learn library are out of the box, but also whether either the built-in calibration techniques Platt scaling and isotonic regression, or Venn-Abers, can be used to improve the calibration. The results show that of the seven algorithms evaluated, the only one obtaining well-calibrated models without external calibration is logistic regression. All other algorithms, i.e., decision trees, AdaBoost, gradient boosting, kNN, Naive Bayes and random forest, benefit from using any of the calibration techniques. In particular, decision trees, Naive Bayes and the boosted models are substantially improved using external calibration. From a practitioner’s perspective, the obvious recommendation is to incorporate calibration when using probabilistic prediction. Comparing the different calibration techniques, Platt scaling and Venn-Abers generally outperform isotonic regression on these rather small datasets. Finally, the unique ability of Venn-Abers to output not only well-calibrated probability estimates, but also the confidence in these estimates, is demonstrated.

National Category
Information Systems
Research subject
Business and IT
Identifiers
urn:nbn:se:hb:diva-26746 (URN)
Conference
The 18th International Conference on Modeling Decisions for Artificial Intelligence, On-line (from Umeå, Sweden), September 27 - 30, 2021.
Available from: 2021-10-15 Created: 2021-10-15 Last updated: 2021-10-18 Bibliographically approved
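Isotonic regression, one of the calibration techniques named in the abstract above, can be sketched with the pool-adjacent-violators procedure. In scikit-learn itself, Platt scaling and isotonic regression are available through `CalibratedClassifierCV` with `method="sigmoid"` and `method="isotonic"`. The stand-alone function below is only an illustrative sketch, not scikit-learn's implementation, and it assumes distinct scores and binary 0/1 labels.

```python
def isotonic_fit(scores, labels):
    """Pool Adjacent Violators: map sorted scores to non-decreasing calibrated values."""
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block is [sum of labels, number of examples]
    for _, y in pairs:
        blocks.append([float(y), 1])
        # Merge backwards while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)  # every member gets the block mean
    return [s for s, _ in pairs], fitted

# Hypothetical classifier scores on a held-out calibration set.
sorted_scores, calibrated = isotonic_fit([0.1, 0.2, 0.3, 0.4], [0, 1, 0, 1])
```

The two middle examples violate monotonicity (a positive scored below a negative), so they are pooled and both receive the block mean.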
Sweidan, D., Johansson, U. & Gidenstam, A. (2020). Predicting returns in men’s fashion. Paper presented at 14th International FLINS Conference (FLINS 2020), Cologne, Germany, 18 – 21 August, 2020 (pp. 1506-1513).
2020 (English) Conference paper, Published paper (Refereed)
Abstract [en]

While consumers value a free and easy return process, the costs to e-tailers associated with returns are substantial and increasing. Consequently, merchants are now tempted to implement stricter policies, but must balance this against the risk of losing valuable customers. With this in mind, data-driven and algorithmic approaches have been introduced to predict if a certain order is likely to result in a return. In this application paper, a novel approach, combining information about the customer and the order, is suggested and evaluated on a real-world data set from a Swedish e-tailer in men’s fashion. The results show that while the predictive accuracy is rather low, a system utilizing the suggested approach could still be useful. Specifically, it is reasonable to assume that an e-tailer would only act on predicted returns where the confidence is very high, e.g., the top 1–5%. For such predictions, the obtained precision is 0.918–0.969, with an acceptable detection rate.

National Category
Computer Sciences
Research subject
Business and IT
Identifiers
urn:nbn:se:hb:diva-24514 (URN); 10.1142/9789811223334_0180 (DOI)
Conference
14th International FLINS Conference (FLINS 2020), Cologne, Germany, 18 – 21 August, 2020
Available from: 2020-12-28 Created: 2020-12-28 Last updated: 2023-03-30 Bibliographically approved
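The idea in the abstract above of acting only on the most confident return predictions (e.g., the top 1–5%) amounts to measuring precision on the highest-ranked fraction of orders. The following is a minimal sketch; the function name and the toy probabilities are hypothetical, and labels are assumed to be 1 for a returned order and 0 otherwise.

```python
def precision_at_top(probabilities, labels, fraction):
    """Precision among the `fraction` of orders with the highest predicted return probability."""
    ranked = sorted(zip(probabilities, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))  # always keep at least one order
    top = ranked[:k]
    return sum(label for _, label in top) / k

# Hypothetical predictions for ten orders.
probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
returned = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
top_20 = precision_at_top(probs, returned, 0.2)
top_50 = precision_at_top(probs, returned, 0.5)
```

Shrinking the fraction trades detection rate for precision, which is the balance the abstract refers to.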
Giri, C., Johansson, U. & Löfström, T. (2019). Predictive Modeling of Campaigns to Quantify Performance in Fashion Retail Industry. In: 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 2019. Paper presented at 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 2019.
2019 (English) In: 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 2019, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

Managing campaigns and promotions effectively is vital for the fashion retail industry. While retailers invest a lot of money in campaigns, customer retention is often very low. At innovative retailers, data-driven methods aimed at understanding and ultimately optimizing campaigns are being introduced. In this application paper, machine learning techniques are employed to analyze data about campaigns and promotions from a leading Swedish e-retailer. More specifically, predictive modeling is used to forecast the profitability and activation of campaigns using different kinds of promotions. In the empirical investigation, regression models are generated to estimate the profitability, and classification models are used to predict the overall success of the campaigns. In both cases, random forests are compared to individual tree models. As expected, the more complex ensembles are more accurate, but the use of interpretable tree models makes it possible to analyze the underlying relationships simply by inspecting the trees. In conclusion, the accuracy of the predictive models must be deemed high enough to make these data-driven methods attractive.

National Category
Computer and Information Sciences
Research subject
Business and IT
Identifiers
urn:nbn:se:hb:diva-23012 (URN); 10.1109/BigData47090.2019.9005492 (DOI); 2-s2.0-85081295913 (Scopus ID); 978-1-7281-0858-2 (ISBN); 978-1-7281-0859-9 (ISBN)
Conference
2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 2019.
Available from: 2020-03-13 Created: 2020-03-13 Last updated: 2024-02-01 Bibliographically approved
Löfström, T., Johansson, U., Balkow, J. & Sundell, H. (2018). A data-driven approach to online fitting services. In: Jun Liu (Ulster University, UK), Jie Lu (University of Technology Sydney, Australia), Yang Xu (Southwest Jiaotong University, China), Luis Martinez (University of Jaén, Spain) and Etienne E Kerre (University of Ghent, Belgium) (Ed.), Data Science and Knowledge Engineering for Sensing Decision Support. Paper presented at 13th International FLINS Conference, Belfast, August 21-24, 2018 (pp. 1559-1566).
2018 (English) In: Data Science and Knowledge Engineering for Sensing Decision Support / [ed] Jun Liu (Ulster University, UK), Jie Lu (University of Technology Sydney, Australia), Yang Xu (Southwest Jiaotong University, China), Luis Martinez (University of Jaén, Spain) and Etienne E Kerre (University of Ghent, Belgium), 2018, p. 1559-1566. Conference paper, Published paper (Refereed)
Abstract [en]

Being able to accurately predict several attributes related to size is vital for services supporting online fitting. In this paper, we investigate a data-driven approach, comparing two different supervised modeling techniques for predictive regression: standard multiple linear regression and neural networks. Using a fairly large, publicly available data set of high quality, the main results are somewhat discouraging. Specifically, it is questionable whether key attributes like sleeve length, neck size, waist and chest can be modeled accurately enough using easily accessible input variables such as sex, weight and height. This is despite the fact that several online services offer exactly this functionality. For this specific task, the results show that standard linear regression was as accurate as the potentially more powerful neural networks. Most importantly, comparing the predictions to reasonable levels for acceptable errors, it was found that an overwhelming majority of all instances had at least one attribute with an unacceptably high prediction error. In fact, when requiring that all variables are predicted with an acceptable accuracy, less than 5% of all instances met that criterion. Specifically, for females, the success rate was as low as 1.8%.

Keywords
Predictive regression, online fitting, fashion
National Category
Computer Sciences
Research subject
Business and IT
Identifiers
urn:nbn:se:hb:diva-15824 (URN); 10.1142/11069 (DOI)
Conference
13th International FLINS Conference, Belfast, August 21-24, 2018.
Projects
Datadriven innovation
Funder
Knowledge Foundation
Available from: 2019-02-25 Created: 2019-02-25 Last updated: 2020-01-29 Bibliographically approved
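The success criterion in the abstract above, that every predicted attribute must fall within an acceptable error, can be written as a short check. This is a sketch under stated assumptions: the tolerance values and measurements below are hypothetical, since the record does not list the thresholds the paper used.

```python
def within_tolerance(predicted, actual, tolerance):
    """True only if every attribute's absolute error is within its tolerance."""
    return all(abs(predicted[name] - actual[name]) <= tolerance[name] for name in tolerance)

def success_rate(instances, tolerance):
    """Share of (predicted, actual) pairs where all attributes are acceptable."""
    hits = sum(within_tolerance(p, a, tolerance) for p, a in instances)
    return hits / len(instances)

# Hypothetical tolerances in centimetres; not the paper's actual thresholds.
tolerance = {"chest": 2.0, "waist": 2.0}
instances = [
    ({"chest": 100.0, "waist": 85.0}, {"chest": 101.0, "waist": 86.0}),  # both within tolerance
    ({"chest": 100.0, "waist": 85.0}, {"chest": 101.0, "waist": 90.0}),  # waist is off by 5
]
rate = success_rate(instances, tolerance)
```

Because a single out-of-tolerance attribute fails the whole instance, the joint success rate can be far lower than any per-attribute accuracy, which is consistent with the low figures reported in the abstract.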
Sundell, H., Löfström, T. & Johansson, U. (2018). Explorative multi-objective optimization of marketing campaigns for the fashion retail industry. In: Jun Liu, Jie Lu, Yang Xu, Luis Martinez and Etienne E Kerre (Ed.), Data Science and Knowledge Engineering for Sensing Decision Support. Paper presented at FLINS 2018, Belfast, August 21-24, 2018 (pp. 1551-1558).
2018 (English) In: Data Science and Knowledge Engineering for Sensing Decision Support / [ed] Jun Liu, Jie Lu, Yang Xu, Luis Martinez and Etienne E Kerre, 2018, p. 1551-1558. Conference paper, Published paper (Refereed)
Abstract [en]

We show how an exploratory tool for association rule mining can be used for efficient multi-objective optimization of marketing campaigns for companies within the fashion retail industry. We have earlier designed and implemented a novel digital tool for mining association rules from given basket data. The tool supports efficient finding of frequent itemsets over multiple hierarchies and interactive visualization of the corresponding association rules together with numerical attributes. Normally when optimizing a marketing campaign, factors that cause an increased level of activation among the recipients could in fact reduce the profit, i.e., these factors need to be balanced rather than optimized individually. Using the tool, we can identify important factors that influence the search for an optimal campaign with respect to both activation and profit. We show empirical results from a real-world case study using campaign data from a well-established company within the fashion retail industry, demonstrating how activation and profit can be simultaneously targeted, using computer-generated algorithms as well as human-controlled visualization.

Keywords
Association rules, marketing, visualization, Pareto front
National Category
Computer Sciences
Research subject
Business and IT
Identifiers
urn:nbn:se:hb:diva-15138 (URN)
Conference
FLINS 2018, Belfast, August 21-24, 2018.
Funder
Knowledge Foundation, 20160035
Available from: 2018-10-01 Created: 2018-10-01 Last updated: 2020-01-29 Bibliographically approved
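Balancing activation against profit, as described in the abstract above and named in its keywords, corresponds to finding the Pareto front of the two objectives. The sketch below is a plain dominance filter, not the tool described in the paper; the campaign names and numbers are hypothetical.

```python
def pareto_front(campaigns):
    """Return names of campaigns not dominated in (activation, profit); both are maximized."""
    front = []
    for name, activation, profit in campaigns:
        # A campaign is dominated if another is at least as good on both
        # objectives and strictly better on at least one.
        dominated = any(
            a >= activation and p >= profit and (a > activation or p > profit)
            for _, a, p in campaigns
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical campaigns: (name, activation rate, profit).
campaigns = [
    ("A", 0.30, 100.0),
    ("B", 0.20, 150.0),
    ("C", 0.15, 120.0),  # dominated by B
    ("D", 0.30, 90.0),   # dominated by A
]
front = pareto_front(campaigns)
```

Campaigns on the front represent the available trade-offs; choosing among them remains a business decision, which is where interactive visualization would come in.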
Johansson, U., Löfström, T., Sundell, H., Linusson, H., Gidenstam, A. & Boström, H. (2018). Venn predictors for well-calibrated probability estimation trees. In: Alex J. Gammerman and Vladimir Vovk and Zhiyuan Luo and Evgueni N. Smirnov and Ralf L. M. Peeter (Ed.), 7th Symposium on Conformal and Probabilistic Prediction and Applications: COPA 2018, 11-13 June 2018, Maastricht, The Netherlands. Paper presented at 7th Symposium on Conformal and Probabilistic Prediction and Applications, London, June 11th - 13th, 2018 (pp. 3-14).
2018 (English) In: 7th Symposium on Conformal and Probabilistic Prediction and Applications: COPA 2018, 11-13 June 2018, Maastricht, The Netherlands / [ed] Alex J. Gammerman and Vladimir Vovk and Zhiyuan Luo and Evgueni N. Smirnov and Ralf L. M. Peeter, 2018, p. 3-14. Conference paper, Published paper (Refereed)
Abstract [en]

Successful use of probabilistic classification requires well-calibrated probability estimates, i.e., the predicted class probabilities must correspond to the true probabilities. The standard solution is to employ an additional step, transforming the outputs from a classifier into probability estimates. In this paper, Venn predictors are compared to Platt scaling and isotonic regression, for the purpose of producing well-calibrated probabilistic predictions from decision trees. The empirical investigation, using 22 publicly available datasets, showed that the probability estimates from the Venn predictor were extremely well-calibrated. In fact, in a direct comparison using the accepted reliability metric, the Venn predictor estimates were the most exact on every data set.

Series
Proceedings of Machine Learning Research
Keywords
Venn predictors, Calibration, Decision trees, Reliability
National Category
Computer Sciences
Research subject
Business and IT
Identifiers
urn:nbn:se:hb:diva-15061 (URN)
Conference
7th Symposium on Conformal and Probabilistic Prediction and Applications, London, June 11th - 13th, 2018
Funder
Knowledge Foundation
Available from: 2018-09-04 Created: 2018-09-04 Last updated: 2020-01-29 Bibliographically approved
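The abstract above defines calibration as predicted class probabilities corresponding to the true probabilities. One common way to quantify the gap is a binned calibration error, sketched below. This is an assumption for illustration only: the record does not say which reliability metric the paper used, and the function name and bin count are hypothetical.

```python
def expected_calibration_error(probabilities, labels, n_bins=10):
    """Weighted mean gap between predicted probability and observed frequency per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probabilities, labels):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[index].append((p, y))
    n = len(probabilities)
    error = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)     # average predicted probability
        frac_pos = sum(y for _, y in members) / len(members)   # observed positive rate
        error += len(members) / n * abs(mean_p - frac_pos)
    return error

# Hypothetical estimates: four predictions of 0.25 with one positive are perfectly calibrated.
perfect = expected_calibration_error([0.25, 0.25, 0.25, 0.25], [0, 0, 0, 1], n_bins=4)
overconfident = expected_calibration_error([0.9, 0.9], [1, 0])
```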
König, R., Johansson, U., Riveiro, M. & Brattberg, P. (2017). Modeling Golf Player Skill Using Machine Learning. In: International Cross-Domain Conference for Machine Learning and Knowledge Extraction: CD-MAKE 2017: Machine Learning and Knowledge Extraction. Paper presented at International Cross-Domain Conference, Reggio Italy, August 29 – September 1, 2017. (pp. 275-294). Calabri
2017 (English) In: International Cross-Domain Conference for Machine Learning and Knowledge Extraction: CD-MAKE 2017: Machine Learning and Knowledge Extraction, Calabri, 2017, p. 275-294. Conference paper, Published paper (Refereed)
Abstract [en]

In this study, we apply machine learning techniques to modeling golf player skill using a dataset consisting of 277 golfers. The dataset includes 28 quantitative metrics, related to the club head at impact and ball flight, captured using a Doppler radar. For modeling, cost-sensitive decision trees and random forest are used to discern between less skilled players and very good ones, i.e., Hackers and Pros. The results show that both random forest and decision trees achieve high predictive performance with regard to true positive rate, accuracy and area under the ROC curve. A detailed interpretation of the decision trees shows that they concur with modern swing theory, e.g., consistency is very important, while face angle, club path and dynamic loft are the most important evaluated swing factors when discerning between Hackers and Pros. Most of the Hackers could be identified by a rather large deviation in one of these values compared to the Pros. Hackers who had less variation in these aspects of the swing could instead be identified by a steeper swing plane and a lower club speed. The importance of the swing plane is an interesting finding, since it was not expected and is not easy to explain.

Place, publisher, year, edition, pages
Calabri: , 2017
Series
Lecture Notes in Computer Science ; 10410
Keywords
Classification, Decision trees, Machine learning, Golf, Swing analysis
National Category
Computer Sciences
Research subject
Business and IT
Identifiers
urn:nbn:se:hb:diva-13938 (URN); 10.1007/978-3-319-66808-6_19 (DOI); 000455398500019 (); 2-s2.0-85029009266 (Scopus ID); 978-3-319-66807-9 (ISBN)
Conference
International Cross-Domain Conference, Reggio Italy, August 29 – September 1, 2017.
Projects
TIKT2 - GOATS - Golf Data Analytics
Funder
Region Västra Götaland
Available from: 2018-04-04 Created: 2018-04-04 Last updated: 2024-02-01 Bibliographically approved
Linusson, H., Norinder, U., Boström, H., Johansson, U. & Löfström, T. (2017). On the Calibration of Aggregated Conformal Predictors. In: Proceedings of Machine Learning Research. Paper presented at Conformal and Probabilistic Prediction and Applications, Stockholm Sweden 13-16 June, 2017.
2017 (English) In: Proceedings of Machine Learning Research, 2017. Conference paper, Published paper (Refereed)
Abstract [en]

Conformal prediction is a learning framework that produces models that associate with each of their predictions a measure of statistically valid confidence. These models are typically constructed on top of traditional machine learning algorithms. An important result of conformal prediction theory is that the models produced are provably valid under relatively weak assumptions; in particular, their validity is independent of the specific underlying learning algorithm on which they are based. Since validity is automatic, much research on conformal predictors has been focused on improving their informational and computational efficiency. As part of the efforts in constructing efficient conformal predictors, aggregated conformal predictors were developed, drawing inspiration from the field of classification and regression ensembles. Unlike early definitions of conformal prediction procedures, the validity of aggregated conformal predictors is not fully understood: while it has been shown that they might attain empirical exact validity under certain circumstances, their theoretical validity is conditional on additional assumptions that require further clarification. In this paper, we show why validity is not automatic for aggregated conformal predictors, and provide a revised definition of aggregated conformal predictors that gains approximate validity conditional on properties of the underlying learning algorithm.

National Category
Computer Sciences
Identifiers
urn:nbn:se:hb:diva-13636 (URN)
Conference
Conformal and Probabilistic Prediction and Applications, Stockholm Sweden 13-16 June, 2017
Available from: 2018-02-09 Created: 2018-02-09 Last updated: 2020-01-29 Bibliographically approved
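As background to the abstract above, the basic (non-aggregated) inductive conformal classifier can be sketched in a few lines: a test example's nonconformity score is compared against scores from a held-out calibration set to obtain a p-value, and labels whose p-value exceeds the significance level are kept. This sketch does not implement the aggregated predictors the paper analyzes; the function names and scores are hypothetical.

```python
def conformal_p_value(calibration_scores, test_score):
    """Share of calibration nonconformity scores at least as large as the test score."""
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    # The +1 terms count the test example itself, which keeps the p-value valid.
    return (at_least + 1) / (len(calibration_scores) + 1)

def prediction_set(calibration_scores, test_scores_by_label, significance):
    """Keep every label whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores_by_label.items()
        if conformal_p_value(calibration_scores, score) > significance
    }

# Hypothetical nonconformity scores from a calibration set.
calibration = [0.1, 0.2, 0.3, 0.4]
p = conformal_p_value(calibration, 0.35)
labels = prediction_set(calibration, {"pos": 0.15, "neg": 0.9}, significance=0.2)
```

Higher nonconformity means the example looks stranger under the tentative label, so the label is rejected when too few calibration examples look at least as strange.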
Johansson, U., Sundström, M., Håkan, S., Rickard, K. & Jenny, B. (2016). Dataanalys för ökad kundförståelse. Stockholm: Handelsrådet
2016 (Swedish) Report (Other (popular science, discussion, etc.))
Place, publisher, year, edition, pages
Stockholm: Handelsrådet, 2016. p. 66
National Category
Business Administration
Identifiers
urn:nbn:se:hb:diva-12080 (URN)
Available from: 2017-03-31 Created: 2017-03-31 Last updated: 2017-05-02 Bibliographically approved