Results found: 5

Search results
1
Key Factors to Consider when Predicting the Costs of Forwarding Contracts
EN
Predicting the cost of forwarding contracts is a typical problem that logistics companies need to solve in order to optimize their business for better profit. This is the challenge defined in the FedCSIS 2022 Competition, where a five-year history of contract data and delivery routes from a large Polish logistics company is provided to train a machine learning model. In addition to the contract data, historical wholesale fuel prices and euro exchange rates at the contract time are also provided. To address this challenge, we first designed a basic solution in which we focused on feature engineering to find high-impact features for the model. The same set of features was then used to train two different models: one using XGBoost and the other using LightGBM. The average of the two boosting models' predictions was then passed to a post-processing step. In that step, we designed and trained a simple linear regression model to capture the average monthly changes of the contract cost, given the changes in fuel prices and euro exchange rates. These captured changes were used to adjust the predictions from the previous step, addressing the issue that tree-based models cannot predict values they have not seen before. While the basic solution with careful feature selection gave us a place in the top 5, the post-processing strategy in the last step helped us win the 3rd prize in the competition.
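The two-stage idea described in this abstract (average two boosted models, then rescale by a linear trend, because tree ensembles cannot extrapolate beyond target values seen in training) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names, the single-predictor least squares fit, and the multiplicative form of the adjustment are all hypothetical, and the numbers are made up for illustration.

```python
from statistics import mean


def average_predictions(preds_a, preds_b):
    """Blend two models by averaging their predictions element-wise."""
    return [(a + b) / 2 for a, b in zip(preds_a, preds_b)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def adjust(preds, fuel_change, slope, intercept):
    """Scale predictions by the relative cost change the line implies.

    Assumption: the adjustment is multiplicative; the abstract does not say.
    """
    factor = 1.0 + slope * fuel_change + intercept
    return [p * factor for p in preds]


# Made-up data: monthly relative change in fuel price vs. in mean contract cost.
fuel_changes = [0.00, 0.02, 0.04, 0.06]
cost_changes = [0.00, 0.01, 0.02, 0.03]
slope, intercept = fit_line(fuel_changes, cost_changes)

# Stand-ins for the outputs of the two boosted models on two contracts.
blended = average_predictions([100.0, 200.0], [110.0, 190.0])
adjusted = adjust(blended, fuel_change=0.10, slope=slope, intercept=intercept)
```

Here the fuel change of 0.10 lies outside the range used to fit the line, which is exactly the situation where a linear correction can extend predictions that a tree-based model alone would keep within its training range.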
EN
A common business practice for transportation forwarders is to bid for shipping contracts at transport or freight exchanges. Based on the detailed contract requirements, they try to estimate the total expected cost of executing the contract and bid accordingly, fixing the price in advance for delivering the shipping service to the prescribed specification and schedule. The ability to accurately predict the cost of contract execution is the critical factor determining both the profitability of the offered shipping services and the amount of business drawn from freight exchanges. However, given the highly volatile nature of the transport services ecosystem, it is difficult to simultaneously account for countless dynamically changing factors such as fuel prices, currency exchange rates, the temporal and spatial multitude of routing options and implied traffic risks, and the properties of cargo and shipping vehicles. This leads to large cost under- or over-estimation, resulting in loss-making contracts or equally painful missed revenue opportunities. In the context of the FedCSIS 2022 data mining competition, we propose an accurate and robust predictor of the cost of forwarding contracts, built upon the detailed contract data using an ensemble of state-of-the-art gradient-boosting-based regression models. Our established feature engineering framework, combined with deep parametric optimization of the individual models and multi-faceted diversification techniques guiding hybrid final model ensembles, was instrumental in outperforming all competing predictors and winning the FedCSIS 2022 contest.
3
Deep Bi-Directional LSTM Networks for Device Workload Forecasting
EN
Deep convolutional neural networks revolutionized the area of automated object detection from images. Can the same be achieved in the domain of time series forecasting? Can one build a universal deep network that, once trained on the past, would be able to deliver accurate predictions reaching deep into the future for any time series, even the most diverse? This work is a first step in an attempt to address such a challenge in the context of the FEDCSIS'2020 Competition dedicated to network device workload prediction based on historical time series data. We have developed and pre-trained a universal 3-layer bi-directional Long Short-Term Memory (LSTM) regression network that reported the most accurate hourly predictions of the weekly workload time series from thousands of different network devices with diverse shape and seasonality profiles. We also show how intuitive human-led post-processing of the raw LSTM predictions could easily destroy the generalization abilities of such a prediction model.
4
Greedy incremental support vector regression
EN
Support Vector Regression (SVR) is a powerful supervised machine learning model, especially well suited to normalized or binarized data. However, its quadratic complexity in the number of training examples rules out training on large datasets, especially high-dimensional ones that require frequent retraining. We propose a simple two-stage greedy selection of training data for SVR that maximizes its validation set accuracy with the minimum number of training examples, and illustrate the performance of this strategy in the context of the Clash Royale Challenge 2019, concerned with efficient prediction of decks' win rates. Hundreds of thousands of labelled examples were reduced to hundreds, on which an optimized SVR was trained to maximize the validation R2 score. The proposed model took first place in the Clash Royale Challenge 2019, outperforming over a hundred competing teams from around the world.
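Greedy selection of training examples against a validation score can be illustrated with a small standard-library sketch. It is not the paper's two-stage procedure, whose details the abstract does not give: it shows a plain single-stage forward selection, and it swaps in a Gaussian-kernel weighted average as a stand-in for SVR so that the example runs without third-party packages. All function names, the kernel width, and the synthetic sine data are assumptions made for illustration.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kernel_predict(train, x, width=0.5):
    """Stand-in for SVR: Gaussian-kernel weighted average of training targets."""
    weights = [math.exp(-((x - xi) / width) ** 2) for xi, _ in train]
    total = sum(weights)
    if total == 0.0:  # query is far from every training point
        return sum(yi for _, yi in train) / len(train)
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / total


def greedy_select(pool, valid, max_size):
    """Grow the training subset one example at a time.

    Each round adds the pool example that yields the highest validation R2.
    Stops when no candidate improves the score or max_size is reached.
    """
    xs_val = [x for x, _ in valid]
    ys_val = [y for _, y in valid]
    chosen, best = [], -math.inf
    remaining = list(pool)
    while remaining and len(chosen) < max_size:
        scored = []
        for i, cand in enumerate(remaining):
            trial = chosen + [cand]
            preds = [kernel_predict(trial, x) for x in xs_val]
            scored.append((r2_score(ys_val, preds), i))
        score, idx = max(scored)
        if score <= best:
            break
        best = score
        chosen.append(remaining.pop(idx))
    return chosen, best


# Synthetic (x, y) pairs sampled from a sine curve; purely illustrative.
pool = [(i / 10, math.sin(i / 10)) for i in range(0, 63)]
valid = [(i / 10 + 0.05, math.sin(i / 10 + 0.05)) for i in range(0, 62, 3)]
subset, score = greedy_select(pool, valid, max_size=10)
```

Each round refits the model once per remaining candidate, so the cost grows with the pool size times the subset size. That is affordable only because the selected subset stays small, which is the point of the approach.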
5
Efficient support vector regression with reduced training data
EN
Support Vector Regression (SVR), a supervised machine learning algorithm, has gained popularity in various fields. However, the quadratic complexity of SVR in the number of training examples rules it out for many practical applications with large training datasets. This paper explores efficient ways to maximize the prediction accuracy of SVR with the minimum number of training examples. For this purpose, a clustered greedy strategy and a Genetic Algorithm (GA) based approach are proposed for optimal subset selection. The performance of the developed methods is illustrated in the context of the Clash Royale Challenge 2019, concerned with the prediction of decks' win rates. The training dataset of 100,000 examples was reduced to hundreds, which were fed to SVR training to maximize model prediction performance as measured by the validation R2 score. Our approach achieved the second highest score among over a hundred participating teams in this challenge.
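A genetic algorithm for fixed-size subset selection can be sketched with the standard library alone. This is an illustrative sketch, not the paper's method: the abstract gives no GA details, so the encoding (sorted index tuples), the union-based crossover, the single-swap mutation, the elitist survival rule, and all parameter values are assumptions. A Gaussian-kernel weighted average again stands in for SVR so the example needs no third-party packages.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kernel_predict(train, x, width=0.5):
    """Stand-in for SVR: Gaussian-kernel weighted average of training targets."""
    weights = [math.exp(-((x - xi) / width) ** 2) for xi, _ in train]
    total = sum(weights)
    if total == 0.0:  # query is far from every training point
        return sum(yi for _, yi in train) / len(train)
    return sum(w * yi for w, (_, yi) in zip(weights, train)) / total


def fitness(indices, pool, valid):
    """Validation R2 of the stand-in model trained on the indexed subset."""
    train = [pool[i] for i in indices]
    preds = [kernel_predict(train, x) for x, _ in valid]
    return r2_score([y for _, y in valid], preds)


def evolve(pool, valid, subset_size, pop_size=20, generations=30, seed=0):
    """Evolve index subsets; the top half of each generation survives."""
    rng = random.Random(seed)
    n = len(pool)
    population = [tuple(sorted(rng.sample(range(n), subset_size)))
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda ind: fitness(ind, pool, valid),
                        reverse=True)
        history.append(fitness(ranked[0], pool, valid))
        elite = ranked[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            genes = sorted(set(a) | set(b))
            # Crossover: draw the child's indices from the parents' union.
            child = set(rng.sample(genes, subset_size))
            if rng.random() < 0.3:
                # Mutation: drop one index and add one not currently held.
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([i for i in range(n) if i not in child]))
            children.append(tuple(sorted(child)))
        population = elite + children
    best = max(population, key=lambda ind: fitness(ind, pool, valid))
    return best, fitness(best, pool, valid), history


# Synthetic (x, y) pairs sampled from a sine curve; purely illustrative.
pool = [(i / 10, math.sin(i / 10)) for i in range(0, 63)]
valid = [(i / 10 + 0.05, math.sin(i / 10 + 0.05)) for i in range(0, 62, 3)]
best, score, history = evolve(pool, valid, subset_size=8)
```

Because the elite carries over unchanged, the best fitness in `history` never decreases from one generation to the next. Unlike greedy growth, the GA evaluates whole subsets, so it can keep combinations of examples that are only useful together.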