select subsamples many times in order to calculate an average value

First of all, sorry for my English; I am a beginner in programming.
In general, I am trying to compute an index from an input file that contains some values. The first function randomly selects values for a series of sample sizes.
The second function takes that output and calculates an index for each sample size. However, when I execute my script I get only one value for each sample size.
But I want to draw the subsamples (200, 400, etc.) multiple times, so that I can calculate the average index for each sample size.

In a first step I passed an input file as an argument to my script.

file_name1 = sys.argv[1]

I read the values from the input file and saved them to a list, as follows:

data2 = [7, 7, 7, 5, 3, 1, 2, 8, 6, 5, 1, 1, 9, 7 ......] #sample size 2010

I wrote a function that randomly picks numbers from the list for increasing sample sizes (200, 400, ..., n). But I want my script to draw each subsample (200, 400, 600 values, and so on) many times, so that I can calculate the average index for each sample size.

Example:

first script execution gives me for 200 values an index of 4.67
second script execution gives me for 200 values an index of 4.32
third script execution gives me for 200 values an index of 4.52
...

I need the average index of each sample size.
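
For the three example runs above, the value I am after would simply be their mean. A tiny sketch with those numbers (the variable names are only illustrative):

```python
from statistics import mean

# Index values from three separate runs at sample size 200 (taken from the example above)
runs_200 = [4.67, 4.32, 4.52]

# Average index for this sample size
average_200 = mean(runs_200)
print("sample size: 200, average index: %.2f" % average_200)
```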

Below is the function that randomly picks 200, 400, 600, ... values and stores the count of each distinct value as key-value pairs in a dictionary:

import numpy as np

def subsamples(list_object):
    val = np.array(list_object)
    n = len(val)
    count = 0
    # one dictionary of value counts per sample size
    pois_group = []
    while count < n:
        count += 200
        if count > n:
            break
        # draw `count` values without replacement
        subsample = np.random.choice(val, count, replace=False)
        unique, counts = np.unique(subsample, return_counts=True)
        group_cat = dict(zip(unique, counts))
        pois_group.append(group_cat)
    return pois_group

Additionally I have a second function that calculates an Index for each sample size.

import math

def p(n, N):
    # contribution of one category with count n out of N values
    if n == 0:
        return 0
    else:
        return (float(n)/N) * math.log(float(n)/N)

def index(object):
    data = subsamples(object)
    x, y = [], []
    for i in data:
        N = sum(i.values())
        #calculate Index
        sh = -sum(p(n,N) for n in i.values() if n != 0)
        index = round(math.exp(sh),2)
        print("Index: %f, sample size: %s" % (index, N))
        y.append(index)
        x.append(N)
    return x,y

#call the function
x_1, y_1= index(data1)
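
To make the goal concrete, here is a minimal sketch of the repeated-sampling idea, not a definitive solution. It assumes the index is the exponential of the Shannon entropy, as the code above suggests. The names `exp_shannon`, `average_index`, `step`, `repeats`, `seed` and the `demo` data are hypothetical and not part of my script.

```python
import math
import numpy as np

def exp_shannon(counts):
    """Exponential of the Shannon entropy of a vector of category counts."""
    counts = np.asarray(counts, dtype=float)
    counts = counts[counts > 0]          # empty categories contribute nothing
    prob = counts / counts.sum()
    return math.exp(-np.sum(prob * np.log(prob)))

def average_index(values, step=200, repeats=100, seed=None):
    """Return {sample size: mean index over `repeats` random subsamples}."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    averages = {}
    # sample sizes step, 2*step, ... up to the length of the data
    for size in range(step, len(values) + 1, step):
        indices = []
        for _ in range(repeats):
            # draw `size` values without replacement, as in subsamples()
            subsample = rng.choice(values, size, replace=False)
            _, counts = np.unique(subsample, return_counts=True)
            indices.append(exp_shannon(counts))
        averages[size] = float(np.mean(indices))
    return averages

# Made-up stand-in for the real list of 2010 values (assumption, not my data)
demo = np.random.default_rng(0).integers(1, 10, size=2010)
result = average_index(demo, step=200, repeats=20, seed=1)
for size, avg in result.items():
    print("sample size: %d, average index: %.2f" % (size, avg))
```

The inner loop repeats the draw for one sample size, and only the mean of those repeats is kept, which is the value I would like to end up with for each sample size.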
