
Results found: 10

Search results
Searched for:
in keywords: topic modelling
EN
Background: Python is a popular and easy-to-use programming language. It is constantly expanding, with new features and libraries being introduced daily for a broad range of applications. This dynamic expansion needs a robust support structure for developers to effectively utilise the language. Aim: In this study we conduct an in-depth analysis focusing on several research topics to understand the themes of Python questions and identify the challenges that developers encounter, using the questions posted on Stack Overflow. Method: We perform a quantitative and qualitative analysis of Python questions on Stack Overflow. Topic modelling is also used to determine the most popular and most difficult topics among developers. Results: The findings of this study reveal a recent surge in questions about the scientific computing libraries pandas and TensorFlow. We also observe that the discussion of Data Structures and Formats is more popular in the Python community, whereas areas such as Installation, Deployment, and IDE are still challenging. Conclusion: This study can direct the research and development community to put more emphasis on tackling the actual issues that Python programmers are facing.
EN
The fourth industrial revolution has resulted in technological advancements in the manufacturing industry. However, the innovation potential embedded in these technologies has to be unlocked by a viable application, i.e., the business model (BM). The BM, as a holistic concept featuring different interacting elements, is thus emerging as a promising vehicle for innovation. Current BM research describes the entire domain but lacks depth in the characterization of its individual components. This paper investigates the available manufacturing literature through the lens of the BM concept by means of a scientometric analysis. The results are presented in a relational framework that provides an in-depth characterization of the manufacturing element of the BM and highlights identified connections that link the BM components. This is the basis for tools that will support firms in developing manufacturing portfolios aligned with their strategic goals.
Quality of life in rural areas: A topic for the Rural Development policy?
EN
Contemporary transformations of rural areas involve changes in land uses, economic perspectives, connectivity, livelihoods, but also in lifestyles, whereupon a traditional view of ‘the rural’ and, consequently, of ‘rural development’ no longer holds. Accordingly, the EU’s 2007–2013 Rural Development policy (RDP) is one framework to incorporate aspects labelled as quality of life (QOL) alongside traditional rural tenets. With a new rendition of the RDP underway, this paper scopes the content and extent of the expired RDP regarding its incorporation of QOL, in order to better identify considerations for future policy making. Using a novel methodology called topic modelling, a series of latent semantic structures within the RDP could be unravelled and re-interpreted via a dual categorization system based on the RDP’s own view on QOL, and on definitions provided by independent research. Corroborated by other audits, the findings indicate a thematic overemphasis on agriculture, with the focus on QOL being largely insignificant. Such results point to a rationale different from the assumed one, at the same time reinforcing an outdated view of rurality in the face of the ostensibly fundamental turn towards viewing rural areas in a wider, more humanistic, perspective. This unexpected issue of underrepresentation is next addressed through three possible drivers: conceptual (lingering productionist view of the rural), ideological (capitalist prerogative preventing non-pecuniary values from entering policy) and material (institutional lock-ins incapable of accommodating significant deviations from an agricultural focus). The paper ends with a critical discussion and some reflections on the broader concept of rurality.
EN
The changing social reality, which is increasingly digitally networked, requires new research methods capable of analysing large bodies of data (including textual data). This development poses a challenge for sociology, whose ambition is primarily to describe and explain social reality. As traditional sociological research methods focus on analysing relatively small datasets, the existential challenge of today involves the need to embrace new methods and techniques, which enable valuable insights into big volumes of data at speed. One such emerging area of investigation involves the application of Natural Language Processing and Machine Learning to text mining, which allows for swift analyses of vast bodies of textual content. The paper’s main aim is to probe whether such a novel approach, namely, topic modelling based on the Latent Dirichlet Allocation (LDA) algorithm, can find meaningful applications within sociology and whether its adoption makes sociology perform its tasks better. In order to outline the context of the applicability of LDA in the social sciences and humanities, an analysis of abstracts of articles on topic modelling published in journals indexed in Elsevier’s Scopus database was conducted. This study, based on 1,149 abstracts, not only showed the diversity of topics undertaken by researchers but also helped to answer the question of whether sociology using topic modelling is “good” sociology in the sense that it provides opportunities for exploration of topic areas and data that would not otherwise be undertaken.
EN
Given the challenges that the massive volume of published research papers poses for the social sciences, it is worth looking at novel, though no longer brand-new, machine learning methods for the purposes of literature review. To this end, I explore a probabilistic topic model called Latent Dirichlet Allocation (LDA) in the context of the epistemological challenge of analysing texts on social welfare. This paper aims to describe how the LDA algorithm works for large corpora of data, along with its advantages and disadvantages. This preliminary characterisation of an inductive method for automated text analysis is intended to give a brief overview of how LDA can be used in the social sciences.
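As a rough illustration of how an LDA model can be fitted to a small corpus, the sketch below uses collapsed Gibbs sampling with the Python standard library only. The abstract does not say which inference procedure the paper discusses, so the choice of Gibbs sampling is an assumption, and the function name `lda_gibbs`, the toy documents and the hyperparameter values are all hypothetical.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0, top_n=3):
    """Fit LDA to tokenised documents with collapsed Gibbs sampling.

    Returns (theta, top_words): per-document topic proportions and the
    most frequent words assigned to each topic.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha) for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        [w for w, c in topic_word[t].most_common(top_n) if c > 0]
        for t in range(num_topics)
    ]
    return theta, top_words


# Hypothetical toy corpus: two documents about welfare, two about sport.
toy_docs = [
    "welfare benefit policy welfare state".split(),
    "benefit policy state pension welfare".split(),
    "football match goal team football".split(),
    "team goal match league football".split(),
]
theta, top_words = lda_gibbs(toy_docs, num_topics=2, iterations=100)
```

Each row of `theta` is a probability distribution over topics for one document. On a corpus this small the result depends heavily on the seed and the hyperparameters, so it should be read as a demonstration of the mechanics rather than a meaningful model.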
EN
Objective: The objective of the paper is to analyse publicly available government policy documents of the United Arab Emirates (UAE) and the Kingdom of Saudi Arabia (KSA) in order to identify key topics and themes for these two countries in relation to the COVID-19 response. Research Design & Methods: Given the availability of large volumes of documents as well as advances in computing systems, text mining has emerged as a significant tool for analysing large volumes of unstructured data. For this paper, we have applied latent semantic analysis and Singular Value Decomposition (SVD) for text clustering. Findings: The results of the analysis of terms indicate similarities of key themes around health and the pandemic for the UAE and the KSA. However, the results of text clustering indicate that the focus of the UAE’s documents is on ‘Digital’-related terms, whereas for the KSA it is on ‘International Travel’-related terms. Further analysis by means of topic modelling demonstrates that topics such as ‘Vaccine Trial’, ‘Economic Recovery’, ‘Health Ministry’, and ‘Digital Platforms’ are common across both the UAE and the KSA. Contribution / Value Added: The study contributes to the text-mining literature by providing a framework for analysing public policy documents at the country level. This can help to understand the key themes in government policies and can potentially aid the identification of the success and failure of various policy measures in certain cases by means of comparing the outcomes. Implications / Recommendations: The results of this study show that text clustering of unstructured data such as policy documents can be very useful for understanding the themes and orientation of the policies.
PL
This article discusses the automatic extraction of relevant words from sets of texts. The author briefly presents three methods for extracting from a corpus either words that stand out by their frequency or words whose occurrence next to each other is not random. First, he focuses on the keyword analysis method, then he discusses the Zeta method developed by John Burrows and Hugh Craig, and the third method covered in the article is topic modelling, which has recently become very popular and consists in finding clusters of words co-occurring in similar contexts. Topic modelling was originally intended for quick content search in large collections of documents. On the basis of 100 Polish novels, the article presents how this method can be used for linguistic studies.
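To make the Zeta method more concrete, here is a minimal sketch of one commonly cited formulation, Craig's simplified Zeta: for each word, the share of target segments that contain it plus the share of reference segments that do not. Whether the article uses exactly this variant is not stated in the abstract, and the function name `craig_zeta` and the toy segments are hypothetical.

```python
def craig_zeta(target_segments, reference_segments):
    """Score words by Craig's Zeta.

    Each segment is a list of tokens. The score ranges from 0 (word appears in
    every reference segment and no target segment) to 2 (the reverse).
    """
    target_sets = [set(seg) for seg in target_segments]
    reference_sets = [set(seg) for seg in reference_segments]
    vocab = set().union(*target_sets, *reference_sets)

    scores = {}
    for word in vocab:
        share_in_target = sum(word in s for s in target_sets) / len(target_sets)
        share_in_reference = sum(word in s for s in reference_sets) / len(reference_sets)
        scores[word] = share_in_target + (1.0 - share_in_reference)
    return scores


# Hypothetical toy segments; real studies split long texts into equal-sized chunks.
target = [["sea", "ship"], ["sea", "storm"]]
reference = [["city", "street"], ["city", "sea"]]
scores = craig_zeta(target, reference)
```

Words with scores near 2 are characteristic of the target texts, and words with scores near 0 are characteristic of the reference texts. Because the measure counts segments rather than raw occurrences, a word repeated many times in one passage does not dominate the result.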
PL
Drawing on the publication records of articles indexed in the Web of Science Social Sciences Citation Index, the article analyses the factors behind the preference for the phrase ‘sex differences’ over ‘gender differences’ in titles and author-selected keywords. Our query identified 16,362 articles that use either of these phrases, are assigned to at least one research area related to the social sciences, and were published in 1971–2021. In line with earlier research, we find a significant shift towards the use of the term ‘gender’ in the 1980s. However, for articles published after 1992, the year of publication has a negligible effect on the probability of using ‘gender’ instead of ‘sex’, although significant differences in trends occur in subsets defined by article-level disciplinary classifications. Using the available publication metadata (publication year, research area, publication journal) and the results of topic modelling (LDA) on titles and abstracts, we implement multi-level regression modelling to show that the probability of referring to ‘gender’ instead of ‘sex’ is strongly influenced by article-level disciplinary associations and their topical classification. We find that psychology articles, which remain by far the most numerous, show a lower propensity to use the term ‘gender’ than all other social sciences, especially when collaborating with the life sciences and biomedicine.
EN
Based on the publication records of journal articles indexed in the Web of Science Social Sciences Citation Index, our analysis examines the underlying factors influencing the usage of ‘sex differences’ over ‘gender differences’ in Titles and Author Keywords. Our search query identified 16,362 articles published in 1971–2021 that use either of the phrases and have at least one of their Research Areas belonging to the Social Sciences. In concurrence with earlier research, we find a substantial shift towards using ‘gender’ in the 1980s. However, for records published after 1992, the Publication Year has a negligible aggregate impact on the likelihood of ‘gender’ over ‘sex’, although meaningful trend differences occur across subsets defined by article-level disciplinary associations. Using the available publication meta-data (Publication Year, Research Area, Publication Journal) as well as the results of topic modelling (LDA) on Titles and Abstracts, we implement multi-level regression modelling to demonstrate that the likelihood of referring to ‘gender’ rather than ‘sex’ is strongly influenced by article-level disciplinary associations and their topical classification. We find that Psychology articles, by far the most numerous, exhibit a lower propensity to use ‘gender’ than all the other Social Sciences, especially when collaborating with Life Sciences & Biomedicine.
EN
Topic models are very popular methods of text analysis. The most popular algorithm for topic modelling is LDA (Latent Dirichlet Allocation). Recently, many new methods have been proposed that enable the use of this model in large-scale processing. One problem is that a data scientist has to choose the number of topics manually. This step requires some prior analysis. A few methods have been proposed to automate this step, but none of them works very well if LDA is used as a preprocessing step for further classification. In this paper, we propose an ensemble approach that allows us to use more than one model in the prediction phase while reducing the need to find a single best number of topics. We have also analysed a few methods of estimating the number of topics.
PL
Topic modelling is a popular method of text analysis. One of the most popular topic modelling algorithms is LDA (Latent Dirichlet Allocation) [14]. Recently, many new extensions of this model have been proposed that allow large amounts of data to be processed. One problem with using the LDA algorithm is that the number of topics must be chosen before the algorithm is run. This step requires prior analysis and the involvement of a data analyst. Several methods have been developed to automate this step, but none of them works well when LDA is used for dimensionality reduction before data classification. In this paper, we propose an approach based on an ensemble of multiple models. Such a model avoids the problem of choosing a single best LDA model. We will show that this approach yields a lower classification error. We also propose two new methods for choosing the number of topics when only a single model is to be used.
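The abstract does not specify how the ensemble combines its members, so the following is only a sketch of one plausible scheme, soft voting: each classifier is trained on features from an LDA model with a different number of topics, and their class-probability vectors are averaged at prediction time. The function name `soft_vote`, the topic counts and the probability values are hypothetical.

```python
def soft_vote(probabilities_per_model):
    """Average class-probability vectors from several models.

    Returns (index of the winning class, averaged probability vector).
    """
    num_models = len(probabilities_per_model)
    num_classes = len(probabilities_per_model[0])
    averaged = [
        sum(p[c] for p in probabilities_per_model) / num_models
        for c in range(num_classes)
    ]
    best = max(range(num_classes), key=averaged.__getitem__)
    return best, averaged


# Hypothetical outputs for one document from three classifiers, each trained on
# topic features from an LDA model with a different number of topics
# (for example 10, 20 and 50).
predictions = [
    [0.6, 0.4],
    [0.3, 0.7],
    [0.2, 0.8],
]
label, averaged = soft_vote(predictions)
```

With this kind of combination no single number of topics has to be picked in advance; a poorly chosen topic count affects only one member of the ensemble.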
EN
This article discusses the voluminous literary correspondence of the Estonian poets Marie Under (1883–1980) and Ivar Ivask (1927–1992), with a focus on its first year, 1957–1958. The whole correspondence comprises 550 letters, with an average length of 4000 (later 3000) words; it is held in the Cultural History Archive of the Estonian Literary Museum in Tartu. Both Under and Ivask had been war refugees, with Under and her husband, poet Artur Adson, finding an exile home near Stockholm, Sweden; Ivask and his wife Astrīde, a well-known Latvian poet, emigrated to America after some years spent in DP camps in Germany. Marie Under was already a renowned poet during the Siuru movement in the Estonian Republic, and became a symbol during the Second World War, continuing to publish and hold a large reading audience in exile. In addition to her own poetry, she was a versatile translator of poetry from several languages into Estonian. Ivask, two generations younger than Under, had begun writing in Germany, but continued to search for his linguistic and cultural identity for some time: his mother tongue was Latvian, and the language of his father was Estonian; German was spoken at home. At length, around the time his correspondence with Under began, he decided that Estonian would be his poetic language. After coming to the United States, Ivask completed a PhD in comparative literature and established himself as a scholar and critic in Germanic Studies. He became associated with the publication Books Abroad, later renamed under his editorship as World Literature Today. Under’s and Ivask’s letters are rife with exchanges about core values in poetry, art and worldview, stylistics and poetics, as well as practicalities of publication.
After a brief introduction to theoretical approaches to the analysis of letters and correspondences, the article turns to a topical close reading of the letters from Under and Ivask’s first year: main foci included translations of the poetry of Karl Čaks, translation priorities, discussion of the aims and planned trajectory of a new cultural journal in Estonian named Mana (to which both contributed), perspectives on Ivask’s debut as a young poet, the future of Baltic literatures abroad, and the cultural politics in the exile communities over what attitude to take toward literary production from the homeland. The second part of the article applies methods of digital humanities toward an extensive study of the Under-Ivask correspondence as a linguistic dataset, aiming to arrive at a thematic analysis of the text as a whole. The methods enable the identification of key words, word frequencies and thematic clusters, while making the whole corpus digitally accessible to the scholarly reader. The article concludes with proposals for a further study of the Under-Ivask correspondence, using the methods of digital humanities.