Data organization

Updated: 2024-10-12 20:27:26


Content

1. What is Data?

1.1 Data & information

1.2 Attribute & object

1.3 Data types

1.4 Discrete & Continuous

1.5 Measurement

1.6 Precision

1.7 Questions


 

1. What is Data?

Anything that different people find difficult or impossible to describe in the same way is not data.

1.1 Data & information

  1. Data are "objective".
  2. New observations that nobody has witnessed before are not data.
  3. Non-transferable content is not data.
  4. Data and relationships are the two basic building blocks of any statistical analysis.
  5. "Information" is data capable of removing uncertainty.
  6. Data that we neither know nor need to know, and data that we already know, are not information. Only missing, required data is information.
  7. The two major tasks of statisticians and econometricians are the discovery of relationships (modelling) and inference.

1.2 Attribute & object

The two elements of an observation are the "attribute" and the "object".

  1. A "table" is a collection of attributes observed in objects.
  2. A table's "entity" is what identifies its objects according to a common, relevant characteristic. Each table has an entity.
  3. Row = record = case.
  4. Columns = one column that identifies the object (key / index / reference) + the attributes of the object (observations).
  5. Not all attributes are variables.
  6. Each observation found in an attribute is a "state" or "occurrence".
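
As a rough illustration of the terms above, the sketch below builds a tiny table in plain Python. The entity ("customer"), the column names and every value are hypothetical and chosen only for the example.

```python
# Hypothetical table whose entity is "customer": each row is a record (case).
customers = [
    {"customer_id": 1, "city": "Lisbon", "balance": 120.5},
    {"customer_id": 2, "city": "Porto", "balance": 80.0},
    {"customer_id": 3, "city": "Lisbon", "balance": 15.25},
]

KEY = "customer_id"  # assumed key column that identifies each object

# Every column other than the key is an attribute of the object.
attributes = [column for column in customers[0] if column != KEY]

# The states of an attribute are the different observations found in it.
city_states = sorted({row["city"] for row in customers})

print(attributes)
print(city_states)
```

The key column only identifies the object; the remaining columns hold the observations.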

1.3 Data types

  • Names (nominal data)
  • Orders (ordinal data)
  • Scales (scalar data)

Comparison of the three types of data

Names

  1. Names are the simplest of all forms of observation.
  2. Purely nominal data occupy the lowest level of informational richness.
  3. Labels (A, B, C, D) distinguish one name from another.
  4. One type of nominal observation is the "binary" observation. A binary attribute whose two states are logically complete is a logical attribute ("logical" means one thing or the other).
  5. A binary attribute requires 1 bit. A nominal attribute with 8 possible states requires 3 bits. If a customer lives in one of 16 possible places, 4 bits are needed.
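
The bit counts in the last item follow from the fact that n bits can distinguish 2^n states. A minimal sketch, in which the function name `bits_needed` is an assumption made for this example:

```python
def bits_needed(states: int) -> int:
    """Smallest number of bits able to label `states` distinct states."""
    if states < 1:
        raise ValueError("an attribute needs at least one state")
    # (states - 1).bit_length() equals ceil(log2(states)) without float rounding.
    return (states - 1).bit_length()

for n in (2, 8, 16):
    print(n, "states ->", bits_needed(n), "bits")
```

For 2, 8 and 16 states this gives 1, 3 and 4 bits, matching the figures above.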

Ordinal

  1. Ordinal attributes are those where names imply order.
  2. It is important to separate nominal attributes from those with an underlying order.
  3. When we treat an order as though it were a simple name, important information is lost.
  4. The distance between states is unknown.
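
One way to see what is lost when an order is treated as a name: sorting ordinal labels alphabetically ignores their rank. The rating labels and their order below are hypothetical.

```python
# Hypothetical ordinal attribute: ratings with an assumed order, lowest first.
RATING_ORDER = ["low", "medium", "high"]

observations = ["medium", "high", "low", "medium"]

# Treated as names: alphabetical sorting ignores the underlying order.
as_names = sorted(observations)

# Treated as an order: sort by each state's rank in RATING_ORDER.
as_order = sorted(observations, key=RATING_ORDER.index)

print(as_names)  # ['high', 'low', 'medium', 'medium']
print(as_order)  # ['low', 'medium', 'medium', 'high']
```

The ranks only say which state comes first. They say nothing about how far apart the states are.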

Scales 

There are two types of scales: "interval" and "rational" scales (the latter are more commonly called "ratio" scales).

  1. Scales are ordered observations where the distance between observations is known.
  2. Interval scale: there is no true zero (e.g., centigrade temperature); a zero value does not mean absence; percentages are meaningless, so differences should be used; values can be added or subtracted, but not multiplied or divided.
  3. Rational scale: there is a real zero; all arithmetic operations can be performed; e.g., a profitability ratio (not an ordinal attribute, because we know the distance between states).
  4. Most widely used indexes are not scales.
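
The difference between the two scale types can be checked numerically. The sketch below compares two hypothetical temperatures in degrees Celsius (an interval scale) and in kelvins (used here, as an assumption of this example, as a scale with a real zero).

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert degrees Celsius to kelvins."""
    return celsius + 273.15

# Two hypothetical temperature observations.
a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences agree on both scales, so subtraction is meaningful.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios disagree, so "twice as hot" is an artifact of the Celsius zero.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(diff_c, round(diff_k, 6))
print(ratio_c, round(ratio_k, 4))
```

The difference is 10 degrees either way, while the ratio is 2 in Celsius and roughly 1.035 in kelvins.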

1.4 Discrete & Continuous

Besides names, orders and scales, there are other ways to classify observations. One of them divides attributes into discrete and continuous.

Discrete

  1. A discrete attribute is one with only a limited number of states.
  2. Names and orders are always discrete. However, the fact that an attribute has a discrete number of states does not mean that it is nominal or ordinal.
  3. A discrete attribute can also be a scale.

Continuous

  1. A continuous attribute has endless states.
  2. A continuous attribute is generally a scale.
  3. E.g.: money, returns, prices, most ratios, betas.

1.5 Measurement

To "measure" is to compare an observation with another, taken as the standard.

  • Names and orders: to measure is the same as to recognize (classify), that is, to choose, from a finite collection of standard states, the one that best approximates the observed one, whether by recognition (names) or by sorting (orders).
  • Scales: certain measurements require the calculation of a difference, which can change the measurement in an unexpected way (logarithms).
  • Interval and rational scales: there is the added problem of having to harmonize the unit of measurement, and that of having to deal with a variety of standards.

An important type of measurement is "counting" (reckoning).

  1. To count is to observe the number of elements in a set.
  2. Classes are names employed in counting. Even where classes are arbitrary, we still call them classes.
  3. The division into classes may be arbitrary, even in the case of nominal and ordinal attributes.
  4. The result of a count made by class is a "frequency".
  5. A relative frequency is a frequency expressed as a percentage of the total, which is an example of a proportion.
  6. A proportion is a percentage of the whole.
  7. Not all percentages are proportions. In a proportion, we cannot observe values above 100%, but percentages can take values above 100% since they are not bounded by a whole.
  8. Relative change / relative difference: any difference expressed as a percentage of one of the two states under comparison.
  9. Relative growth: a relative change in which an observation can grow but never regress.
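
The counting terms above can be sketched in a few lines. The class labels and numbers are hypothetical, and taking the old state as the base of the relative change is an assumption of this example (the text allows either state).

```python
from collections import Counter

# Hypothetical observations of a nominal attribute, one class label each.
observations = ["A", "B", "A", "C", "A", "B"]

# Frequency: the result of a count made by class.
frequency = Counter(observations)

# Relative frequency: each frequency as a share of the total (a proportion).
total = sum(frequency.values())
relative_frequency = {cls: count / total for cls, count in frequency.items()}

def relative_change(old: float, new: float) -> float:
    """Difference expressed as a fraction of the old state (assumed base)."""
    return (new - old) / old

print(dict(frequency))
print(relative_frequency)
print(relative_change(50.0, 150.0))  # 2.0, i.e. 200%: a percentage, not a proportion
```

The relative frequencies sum to 1 because they are bounded by the whole, whereas a relative change can exceed 100%.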

1.6 Precision

In the case of scales, informational richness shows itself through "precision". Precision or "accuracy" is the number of digits used to measure, register, or report a scale observation.

  1. Precision measures the amount of information.
  2. Precision is unduly invented if 3 digits are used to measure solvency and 4 digits to report it; it is unduly suppressed if 3 digits are used to measure solvency and only 2 digits to report it.
  3. Zeros on the left side of a measurement add nothing to precision. E.g., 2.34 is as accurate as 0.0234, as both have only 3 significant digits, but it is more accurate than 2.3 or 0.0023, which have only 2 significant digits.
  4. Zeros on the right side do not add to precision. E.g., 0.0023 and 23.000 generally indicate the same two-digit precision. However, when trailing zeros are truly measured, that is, when they are not just the result of rounding, they add to precision.
  5. Certain transformations reduce the richness of scale data. The logarithm of a number is less informative than the number itself. A logarithm is an interval scale, so zero ceases to be objective. Therefore, when applying logarithms, information is lost.
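
The rules about leading and trailing zeros can be sketched as a small digit counter. The function name and the `trailing_zeros_measured` flag are hypothetical, and the input is assumed to be a plain decimal string without an exponent.

```python
def significant_digits(text: str, trailing_zeros_measured: bool = False) -> int:
    """Count significant digits in a plain decimal string such as '0.0234'."""
    digits = text.lstrip("+-").replace(".", "")
    digits = digits.lstrip("0")  # zeros on the left never add precision
    if not trailing_zeros_measured:
        digits = digits.rstrip("0")  # treat zeros on the right as rounding
    return len(digits)

print(significant_digits("2.34"), significant_digits("0.0234"))    # 3 3
print(significant_digits("2.3"), significant_digits("0.0023"))     # 2 2
print(significant_digits("23.000"))                                # 2
print(significant_digits("23.000", trailing_zeros_measured=True))  # 5
```

Whether trailing zeros were truly measured cannot be read from the number itself, which is why it is passed in as an explicit flag here.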

1.7 Questions

  • "Class" has the same meaning as "state" of the attribute observed. (T)
  • There are widely used economic indices that are based on information that does not exist. (T)
  • Never assume that using too much precision is innocuous and looks good. (T)
  • The datum on a fact of interest is called an observation. The fact of interest is the "object", "subject", "case" or "record".
  • A "state" or "occurrence" of an attribute is each different observation on that attribute.
  • The initial step of any statistical analysis is the description of entities and their relationships.
  • A table is a relationship, but there are relationships that are not in the form of a table, namely equations.
  • A collection of seven brokers is an example of names.
  • An interval scale is one where zero is a convention.
  • The count of the number of objects by state is a distribution.

Published: 2024-03-13 06:06:19
Article link: https://www.elefans.com/category/jswz/34/1733336.html