
Results found: 4

Search results
Searched for:
in keywords: feature engineering

EN
Real estate is one of the most important aspects of our lives and plays a significant role in the global economy. Sooner or later, everyone comes into contact with properties, which serve as places to live, work, invest, and relax. That is why properties are part of many decision-making systems related to valuation, taxation, land planning, and the sustainable development of areas. Analyses of the property market rest on many assumptions, such as the determination of property homogeneity. This paper proposes the use of automated solutions based on robust geo-estimation that enable highly effective identification of property submarkets. The study aims to propose optimal solutions for the initial part of homogeneous market analysis, such as feature engineering, which enables unbiased identification of homogeneous areas (zones). To this end, the following methods based on robust geo-estimation and geoprocessing are used: the Gauss filter, geocoding and reverse geocoding, a tessellation model, and entropy theory.
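The abstract does not spell out how entropy theory is applied, but one plausible reading is to score the homogeneity of each tessellation cell by the Shannon entropy of the property categories it contains. A minimal sketch, with hypothetical cell data and labels:

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy (in bits) of the category distribution within one zone."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical tessellation cells: property-type labels observed in each cell.
cells = {
    "cell_A": ["flat", "flat", "flat", "flat"],       # uniform -> homogeneous
    "cell_B": ["flat", "house", "retail", "office"],  # mixed   -> heterogeneous
}

# Low entropy suggests a homogeneous submarket zone; high entropy a mixed one.
for name, labels in cells.items():
    print(name, round(shannon_entropy(labels), 3))
```

A uniform cell scores 0 bits, while four equally frequent categories score 2 bits, so thresholding the entropy gives a simple homogeneity criterion.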
EN
This paper introduces a new model to improve the quality of logical structure analysis of visually structured documents. To that end, we extend the model of Koreeda and Manning [1]. To enhance the textual features, we define a new feature that uses the font size of the text as an indicator. In our observation, font size is an important indicator of the structure of a document. The new font-size feature is combined with visual, textual, and semantic features to train an analyzer. Experimental results on four legal datasets show that the new font-size feature contributes to the model and helps to improve the F-scores. An ablation study also shows the contribution of each feature in our model.
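The paper does not specify how the font-size feature is encoded; one common choice is to normalize each line's font size by the dominant (body-text) size, so headings stand out regardless of the document's base size. A minimal sketch with hypothetical line data:

```python
from collections import Counter

def font_size_feature(lines):
    """For each (text, font_size) line, emit the size relative to the most
    frequent (body-text) font size -- a simple heading-vs-body indicator."""
    body_size = Counter(size for _, size in lines).most_common(1)[0][0]
    return [(text, size / body_size) for text, size in lines]

# Hypothetical extracted lines: (text, font size in points).
lines = [
    ("1. Introduction", 14.0),
    ("This paper studies ...", 10.0),
    ("We focus on ...", 10.0),
]
for text, ratio in font_size_feature(lines):
    print(ratio, text)  # ratio > 1.0 hints at a heading; 1.0 at body text
```

Such a scalar can then be concatenated with the visual, textual, and semantic features before training the analyzer.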
EN
Objectives: Appropriate care for patients admitted to intensive care units (ICUs) is becoming increasingly prominent, motivating the use of machine learning models. Real-time prediction of the mortality of patients admitted to the ICU has the potential to provide physicians with interpretable results. The growing crisis in healthcare, including soaring costs, unsafe, misdirected, and fragmented care, chronic diseases, and the evolution of epidemic diseases, demands automated, real-time data processing to assure an improved quality of life. ICUs generate a wealth of useful data in the form of Electronic Health Records (EHRs), which allow the development of a well-grounded prediction tool. Method: We aimed to build a mortality prediction model on the 2012 PhysioNet Challenge mortality prediction database of 4,000 patients admitted to the ICU. Challenges in the dataset, such as high dimensionality, imbalanced class distribution, and missing values, were tackled with analytical methods and tools via feature engineering and new variable construction. The objective of the research is to exploit the relations among the clinical variables and construct new variables that establish the effectiveness of a 1-Dimensional Convolutional Neural Network (1-D CNN) with the constructed features. Results: Its performance is compared, in terms of Area Under the Curve (AUC), with traditional machine learning algorithms such as the XGBoost classifier, Light Gradient Boosting Machine (LGBM) classifier, Support Vector Machine (SVM), Decision Tree (DT), K-Neighbours Classifier (K-NN), and Random Forest (RF) classifier, and with recurrent models such as Long Short-Term Memory (LSTM) and LSTM with attention. The investigation reveals the best AUC of 0.848 for the 1-D CNN model. Conclusion: Relationships between the various features were identified, and new features were constructed from existing ones. Multiple models were tested and compared on different metrics.
4. Analysis of Components for Generalization using Multidimensional Scaling
EN
To achieve better software quality, shorten software development time, and lower development costs, software engineers are adopting generative reuse as a software design process. The use of generic components increases reuse and design productivity in software engineering. Generic component design requires systematic domain analysis to identify similar components as candidates for generalization. However, component feature analysis and the identification of components for generalization are usually done ad hoc. In this paper, we propose applying a data visualization method, Multidimensional Scaling (MDS), to analyze software components in a multidimensional feature space. Multidimensional data representing the syntactic and semantic features of source-code components are mapped to 2D space. The results of MDS are used to partition an initial set of components into groups of similar source-code components that can then serve as candidates for generalization. The STRESS value is used to estimate the generalizability of a given set of components. Case studies for the Java Buffer and Geom class libraries are presented.
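The abstract's STRESS value measures how faithfully the 2D MDS map preserves the pairwise distances of the original feature space; Kruskal's STRESS-1 is the standard formulation, though the paper's exact variant is not stated. A minimal sketch with hypothetical distance data:

```python
import math

def stress1(d_high, d_low):
    """Kruskal's STRESS-1 between pairwise distances in the original feature
    space (d_high) and in the 2D MDS embedding (d_low)."""
    num = sum((a - b) ** 2 for a, b in zip(d_high, d_low))
    den = sum(b ** 2 for b in d_low)
    return math.sqrt(num / den)

# Hypothetical pairwise distances among three source-code components.
d_high = [1.0, 2.0, 2.2]   # distances in the multidimensional feature space
d_low  = [1.1, 1.9, 2.2]   # distances after the 2D MDS projection

# Values near 0 indicate the 2D map preserves the structure well,
# so tightly grouped components are plausible generalization candidates.
print(round(stress1(d_high, d_low), 4))
```

A low STRESS for a candidate group suggests its components really are close in the full feature space, supporting their selection for generalization.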