Book details

Data Simplification: Taming Information With Open Source Tools - ISBN 9780128037812

Author: Berman, Jules J.

Publisher: Elsevier

Availability: 3-6 weeks

Price: 296.10 zł

Before placing an order, please contact us by e-mail to confirm the price.


ISBN13: 9780128037812

Author: Berman, Jules J.

Binding: Paperback

Publication date: 2016-03-09

Subjects: UB

Data Simplification: Taming Information With Open Source Tools addresses the simple fact that modern data is too big and complex to analyze in its native form. Data simplification is the process whereby large and complex data is rendered usable. Complex data must be simplified before it can be analyzed, but the process of data simplification is anything but simple, requiring a specialized set of skills and tools.

This book provides data scientists from every scientific discipline with the methods and tools to simplify their data for immediate analysis or long-term storage in a form that can be readily repurposed or integrated with other data.

Drawing upon years of practical experience, and using numerous examples and use cases, Jules Berman discusses the principles, methods, and tools that must be studied and mastered to achieve data simplification. He covers open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data; natural language processing and machine translation as tools for simplifying data; and data summarization and visualization, and the role they play in making data useful for the end user.



  • Discusses data simplification principles, methods, and tools that must be studied and mastered
  • Provides open source tools, free utilities, and snippets of code that can be reused and repurposed to simplify data
  • Explains how to best utilize indexes to search, retrieve, and analyze textual data
  • Shows the data scientist how to apply ontologies, classifications, classes, properties, and instances to data using tried and true methods

Chapter 1. The Simple Life

  • Section 1.1. Simplification drives scientific progress
  • Section 1.2. The human mind is a simplifying machine
  • Section 1.3. Simplification in Nature
  • Section 1.4. The Complexity Barrier
  • Section 1.5. Getting ready

Open Source Tools for Chapter 1

Perl

Python

Ruby

Text Editors

OpenOffice

Command line utilities

Cygwin, Linux emulation for Windows

DOS batch scripts

Linux bash scripts

Interactive line interpreters

Package installers

System calls

References for Chapter 1

Glossary for Chapter 1

 

Chapter 2. Structuring Text

  • Section 2.1. The Meaninglessness of free text
  • Section 2.2. Sorting text, the impossible dream
  • Section 2.3. Sentence Parsing
  • Section 2.4. Abbreviations
  • Section 2.5. Annotation and the simple science of metadata
  • Section 2.6. Specifications Good, Standards Bad

Open Source Tools for Chapter 2

ASCII

Regular expressions

Format commands

Converting non-printable files to plain-text

Dublin Core

References for Chapter 2

Glossary for Chapter 2

 

Chapter 3. Indexing Text

  • Section 3.1. How Data Scientists Use Indexes
  • Section 3.2. Concordances and Indexed Lists
  • Section 3.3. Term Extraction and Simple Indexes
  • Section 3.4. Autoencoding and Indexing with Nomenclatures
  • Section 3.5. Computational Operations on Indexes

Open Source Tools for Chapter 3

Word lists

Doublet lists

Ngram lists

References for Chapter 3

Glossary for Chapter 3

 

Chapter 4. Understanding Your Data

  • Section 4.1. Ranges and Outliers
  • Section 4.2. Simple Statistical Descriptors
  • Section 4.3. Retrieving Image Information
  • Section 4.4. Data Profiling
  • Section 4.5. Reducing data

Open Source Tools for Chapter 4

Gnuplot

MatPlotLib

R, for statistical programming

Numpy

Scipy

ImageMagick

Displaying equations in LaTeX

Normalized compression distance

Pearson's correlation

The ridiculously simple dot product

References for Chapter 4

Glossary for Chapter 4

 

Chapter 5. Identifying and Deidentifying Data

  • Section 5.1. Unique Identifiers
  • Section 5.2. Poor Identifiers, Horrific Consequences
  • Section 5.3. Deidentifiers and Reidentifiers
  • Section 5.4. Data Scrubbing
  • Section 5.5. Data Encryption and Authentication
  • Section 5.6. Timestamps, Signatures, and Event Identifiers

Open Source Tools for Chapter 5

Pseudorandom number generators

UUID

Encryption and decryption with OpenSSL

One-way hash implementations

Steganography

References for Chapter 5

Glossary for Chapter 5

 

Chapter 6. Giving Meaning to Data

  • Section 6.1. Meaning and Triples
  • Section 6.2. Driving Down Complexity with Classifications
  • Section 6.3. Driving Up Complexity with Ontologies
  • Section 6.4. The unreasonable effectiveness of classifications
  • Section 6.5. Properties that Cross Multiple Classes

Open Source Tools for Chapter 6

Syntax for triples

RDF Schema

RDF parsers

Visualizing class relationships

References for Chapter 6

Glossary for Chapter 6

 

Chapter 7. Object-oriented data

  • Section 7.1. The Importance of Self-explaining Data
  • Section 7.2. Introspection and Reflection
  • Section 7.3. Object-Oriented Data Objects
  • Section 7.4. Working with Object-Oriented Data

Open Source Tools for Chapter 7

Persistent data

Persistence is the ability of data to outlive the program that produced it.

SQLite databases
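
The idea of persistence defined above can be illustrated with a minimal sketch that uses Python's standard `sqlite3` module. It is not taken from the book: the function names `save_records` and `load_records`, the table name `records`, and the file name `demo.sqlite` are hypothetical, chosen only for illustration. One connection writes the data and is closed; a second, independent connection then reads the same data back from the file.

```python
import os
import sqlite3
import tempfile


def save_records(db_path, records):
    """Write (key, value) pairs to a SQLite file, then close the connection.

    The table name "records" is a hypothetical choice for this sketch.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS records "
                "(key TEXT PRIMARY KEY, value TEXT)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO records (key, value) VALUES (?, ?)",
                records,
            )
    finally:
        conn.close()


def load_records(db_path):
    """Open a fresh connection and return the stored pairs as a dict."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT key, value FROM records").fetchall()
        return dict(rows)
    finally:
        conn.close()


if __name__ == "__main__":
    # A temporary directory keeps the demo from leaving files behind;
    # a real application would use a lasting path instead.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite")
        save_records(path, [("isbn", "9780128037812")])
        print(load_records(path))
```

Because the data lives in the database file and not in the memory of the process that wrote it, any later program that knows the file path can retrieve it.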

References for Chapter 7

Glossary for Chapter 7

 

Chapter 8. Problem simplification

  • Section 8.1. Random numbers
  • Section 8.2. Monte Carlo Simulations
  • Section 8.3. Resampling and Permutating
  • Section 8.4. Verification, Validation, and Reanalysis
  • Section 8.5. Data Permanence and Data Immutability

Open Source Tools for Chapter 8

Burrows Wheeler transform

Winnowing and chaffing

References for Chapter 8

Glossary for Chapter 8

 
