## Why does the result when restoring a saved DDPG model differ significantly from the result when saving it?

I save the trained model after a certain number of episodes with the special save() function of the DDPG class (the network is saved when the reward reaches zero), but when I restore the model again using saver.restore(), the network yields a reward of approximately -1800. Why is this happening? Maybe I’m doing […]
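A frequent cause of this gap is that the checkpoint captures only part of the agent (e.g. the online actor) while target networks, optimizer state, or observation-normalizer statistics are reinitialized on restore, or that evaluation after restoring still adds exploration noise. As a framework-agnostic sketch (the field names and layout below are hypothetical, not the asker's code), checkpointing the *full* agent state might look like:

```python
import pickle

# Hypothetical container for everything DDPG needs to reproduce behaviour.
# Saving only the actor weights is a classic cause of a large reward drop
# after restoring.
agent_state = {
    "actor": {"w": [0.1, 0.2]},          # online actor weights
    "critic": {"w": [0.3, 0.4]},         # online critic weights
    "target_actor": {"w": [0.1, 0.2]},   # target networks matter too
    "target_critic": {"w": [0.3, 0.4]},
    "obs_mean": 0.0, "obs_std": 1.0,     # observation-normalizer statistics
    "noise_sigma": 0.0,                  # disable exploration noise at eval
}

with open("ddpg_checkpoint.pkl", "wb") as f:
    pickle.dump(agent_state, f)

with open("ddpg_checkpoint.pkl", "rb") as f:
    restored = pickle.load(f)

assert restored == agent_state  # the full state round-trips exactly
```

If the restored agent is evaluated with the same noise-free policy and the same normalizer statistics it trained with, the saved and restored rewards should match.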

## How to count the number of letters that exist without repetition

I want a way to count the distinct letters in a string. For example, my string: “Hello my friends ”. The characters in the string: {H,e,l,o, ,m,y,f,r,i,n,d,s}. These letters exist without repetition (the blank space included), so the result I want is: 13. The goal of all of this is, I want to […]
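In Python this is a one-liner with a set, since a set keeps each character exactly once (the space counts as a character, matching the expected answer of 13):

```python
def count_distinct_chars(s):
    """Count the distinct characters in s; a space counts as a character."""
    return len(set(s))

print(count_distinct_chars("Hello my friends "))  # -> 13
```

Note this is case-sensitive: 'H' and 'h' count as two different characters. Apply `s.lower()` first if they should be merged.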

## What is the best way to sign a message, from JavaScript or Solidity?

I am new to Ethereum and smart contracts. I am trying to make a small DApp using web3, Ganache, MetaMask, and JavaScript, connected with Remix to write a contract. My DApp has to sign a message and verify it, so I am confused: I see two ways to sign a message, one using a JavaScript function […]

## ‘utf-8’ codec can’t decode byte – Python

My Django application works with both .txt and .doc file types. The application opens a file, compares it with other files in the db, and prints out a report. The problem is that when the file type is .txt, I get a ‘utf-8’ codec can’t decode byte error (here I’m using encoding=’utf-8’). When I switch encoding=’utf-8’ […]
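The usual stdlib-only workaround for this error is to try a short list of candidate encodings and fall back to one that cannot fail (or to `errors='replace'`). The encoding list below is an assumption; adjust it to the files the application actually receives:

```python
def read_text(data: bytes) -> str:
    """Decode bytes by trying candidate encodings in order.

    The candidate list is a guess (UTF-8, then Windows-1252, a common
    source of stray high bytes in .txt uploads); latin-1 is the final
    fallback because it can decode any byte sequence.
    """
    for enc in ("utf-8", "cp1252"):
        try:
            return data.decode(enc)
        except UnicodeDecodeError:
            continue
    return data.decode("latin-1")  # never raises

print(read_text(b"caf\xe9"))  # invalid UTF-8; decodes via cp1252 -> café
```

For .doc files the bytes are not plain text at all, so a binary-format reader is needed rather than a different text encoding.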

## Efficient algorithm to obtain near optimal policies for an MDP

Given a discrete, finite Markov Decision Process (MDP) with its usual parameters $(S, A, T, R, \gamma)$, it is possible to obtain the optimal policy $\pi^{*}$ and the optimal value function $V^{*}$ through one of many planning methods (policy iteration, value iteration or solving a linear program). I am interested in obtaining a random near-optimal […]
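One common recipe for random near-optimal policies, sketched here on an invented two-state MDP (not from the question): run value iteration to obtain $V^{*}$, then in each state keep every action whose Q-value is within $\epsilon$ of the best, and sample uniformly from that set:

```python
import random

# Tiny hypothetical MDP: states 0,1; actions 0,1.
# T[s][a] = [(prob, next_state)], R[s][a] = reward.
T = {0: {0: [(1.0, 0)], 1: [(1.0, 1)]},
     1: {0: [(1.0, 0)], 1: [(1.0, 1)]}}
R = {0: {0: 0.0, 1: 1.0},
     1: {0: 0.0, 1: 2.0}}
gamma = 0.9

def q_values(V, s):
    """One-step lookahead Q(s, a) for every action a."""
    return {a: R[s][a] + gamma * sum(p * V[ns] for p, ns in T[s][a])
            for a in T[s]}

def value_iteration(tol=1e-8):
    V = {s: 0.0 for s in T}
    while True:
        delta = 0.0
        for s in T:
            new_v = max(q_values(V, s).values())
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            return V

def near_optimal_actions(V, eps):
    """Per state, every action whose Q-value is within eps of the best."""
    out = {}
    for s in T:
        q = q_values(V, s)
        best = max(q.values())
        out[s] = [a for a, v in q.items() if v >= best - eps]
    return out

V = value_iteration()
sets = near_optimal_actions(V, eps=0.5)
# Sample one random near-optimal policy from the per-state action sets.
policy = {s: random.choice(acts) for s, acts in sets.items()}
print(sets, policy)
```

Any policy drawn this way has per-state action gap at most $\epsilon$; turning that into a bound on $\lVert V^{\pi} - V^{*}\rVert_\infty$ (e.g. $\epsilon/(1-\gamma)$) is the standard next step.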

## When to use known languages/libraries vs. investing in learning new ones?

This question is asked in a general way. In case it is hard to understand, I have added a concrete example below. I am interested in the answer to the general question. I have a lot of experience writing programs of type X, but now I need to write a program of type Y. Everybody […]

## Can this be possible deep Q-learning pseudocode?

I am not using replay here.

```
# s - state, a - action, r - reward, n_s - next state
# q_net - neural network representing Q

step() {
    get s, a, r, n_s
    q_target[s,a] = r + gamma * max(q_net[n_s, :])
    loss = mse(q_target, q_net[s,a])
    loss.backprop()
}

while (!terminal) {
    totalReturn += step();
}
```
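The update above is essentially one-step Q-learning without a replay buffer. A runnable tabular sketch of the same target, r + gamma * max Q(n_s, ·), with a table standing in for q_net (the toy environment and hyperparameters are invented for illustration):

```python
import random

random.seed(0)
gamma, lr = 0.9, 0.5
n_states, n_actions = 3, 2
Q = [[0.0] * n_actions for _ in range(n_states)]  # table in place of q_net

def env_step(s, a):
    """Toy deterministic chain: action 1 moves right, reward at the end."""
    n_s = min(s + a, n_states - 1)
    done = n_s == n_states - 1
    r = 1.0 if done else 0.0
    return n_s, r, done

def step(s):
    a = random.randrange(n_actions)           # behaviour policy (random here)
    n_s, r, done = env_step(s, a)
    target = r + (0.0 if done else gamma * max(Q[n_s]))  # q_target
    Q[s][a] += lr * (target - Q[s][a])        # TD update ~ the gradient step
    return n_s, done

for _ in range(200):                          # episodes
    s, done = 0, False
    while not done:
        s, done = step(s)

print(Q[0])  # action 1 (move right) should end up with the higher value
```

One caveat the pseudocode shares with this sketch: without a terminal-state mask and without replay or a target network, the neural-network version is typically much less stable than this tabular one.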

## Command to create/connect to an I/O stream?

So I’m trying to connect to a (Lua) debugger embedded in a program by redirecting the I/O. Currently I create a pair of FIFOs for the read and write streams and connect to them using `cat /tmp/dbg_write & cat > /tmp/dbg_read`. This is workable and pretty straightforward, but if you don’t exit everything just right […]
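As a sketch of the FIFO mechanics (the path and message are placeholders), a single script can open both ends itself, which makes cleanup easier than juggling two `cat` processes. Note the read end is opened with `O_NONBLOCK` first, since opening a FIFO for reading normally blocks until a writer appears:

```python
import os
import tempfile

fifo = os.path.join(tempfile.mkdtemp(), "dbg_read")
os.mkfifo(fifo)

# Opening a FIFO read end normally blocks until a writer connects;
# O_NONBLOCK lets a single process open both ends in order.
rfd = os.open(fifo, os.O_RDONLY | os.O_NONBLOCK)
wfd = os.open(fifo, os.O_WRONLY)

os.write(wfd, b"run")                # what the debugger side would send
msg = os.read(rfd, 1024).decode()
print(msg)                           # -> run

os.close(rfd)
os.close(wfd)
os.unlink(fifo)                      # explicit cleanup on exit
```

Wrapping the close/unlink in a `finally:` block (or a signal handler) addresses the "don't exit everything just right" problem the question describes.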