Blog posts

2199

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.

2024

2023

Basic Concepts in RL

less than 1 minute read

Published:

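State: the status of the agent with respect to the environment.

State space: the set of all states.

Action: a move the agent can make in a given state.

Action space of a state: the set of available actions, which depends on the state.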

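State transition: when the agent takes an action, it may move from one state to another. This defines one form of interaction between the agent and the environment. All transitions can be listed in a table, but a table can only represent the deterministic case.

State transition probability: describes state transitions with probabilities, which also covers the stochastic case.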

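Policy: usually denoted by $\pi$. A policy tells the agent which action to take in each state; it may be stochastic, and it can be represented as a table.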

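Reward: a real number received after taking an action. A positive value can stand for encouragement and a negative value for punishment; the reward can be understood as a kind of human-machine interface.

Trajectory: a state-action-reward chain.

Return: the sum of all rewards collected along a trajectory. Comparing returns shows which policy is better.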

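Discounted return: the return computed with a discount rate $\gamma$. Discounting keeps the return from diverging and controls how short-sighted or far-sighted the agent is.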

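Episode: a trajectory that is finite and stops at a terminal state. Tasks of this kind are called episodic tasks. Some tasks have no terminal states, which means the interaction between the agent and the environment goes on forever; these are called continuing tasks.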
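The return and discounted return defined above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a finite episode whose rewards are already collected in a list; the function name `discounted_return` and the example rewards are hypothetical and not taken from the original post.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum the rewards of one trajectory, weighting step t by gamma**t."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Walk backwards so each step applies G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical episode: three zero-reward steps, then +1 at the terminal state.
episode_rewards = [0, 0, 0, 1]
print(discounted_return(episode_rewards, gamma=1.0))  # plain (undiscounted) return
print(discounted_return(episode_rewards, gamma=0.5))  # a more short-sighted agent
```

With a small $\gamma$ the late reward barely counts, so the agent is short-sighted; with $\gamma$ close to 1 it counts almost fully, so the agent is far-sighted.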

GPT-4

less than 1 minute read

Published:

graph TD
	Toolformer_MetaAI_02/2023 --> LLaMA_MetaAI_02/2023 --> VisualChatGPT_Microsoft_08/03/2023 --> GigaGAN_Adobe_09/03/2023 --> Alpaca_Stanford_13/03/2023 --> GPT4_OpenAI_14/03/2023 --> PaLM_API_GoogleCloud_14/03/2023 --> Claude_Anthropic_14/03/2023 --> SeriesB_350M_USD_Adept.ai_14/03/2023 --> Gen5_text_to_image_model_Midjourney_15/03/2023 --> Copilot_Microsoft_16/03/2023

ChatGPT & InstructGPT

less than 1 minute read

Published:

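1. ChatGPT has safety mechanisms.

2. ChatGPT can understand context and remembers roughly 8,000 words (GPT-4 now reaches about 25,000 words).

3. ChatGPT can recognize its own limitations.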

GPT

less than 1 minute read

Published:

A review of the papers in the GPT series.

CLIP

less than 1 minute read

Published:

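CLIP is a multimodal model that OpenAI released in January 2021, alongside another model, DALL-E. The two differ fundamentally: CLIP uses text as the supervision signal to train a transferable vision model, while DALL-E is a model that generates images from text.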

Leetcode1_twosum_2_twoadd

less than 1 minute read

Published:

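For the first LeetCode problem, Two Sum, the reasoning goes as follows:
For an element nums[i], we need to know whether there is another element nums[j] whose value is target - nums[i]. We use a hash table that maps each element's value to its index, so we can quickly check whether the array contains an element with the value target - nums[i].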
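The hash-table idea above can be sketched as follows. This is a minimal one-pass variant that fills the table while scanning, so an element is never paired with itself; the function name `two_sum` and the choice to return `None` when no pair exists are assumptions made for illustration.

```python
def two_sum(nums, target):
    """Return indices (i, j) with nums[i] + nums[j] == target, or None."""
    index_of = {}  # maps each value seen so far to its index
    for i, value in enumerate(nums):
        complement = target - value
        # A hash lookup answers "has the complement been seen?" in O(1) on average.
        if complement in index_of:
            return index_of[complement], i
        index_of[value] = i
    return None


print(two_sum([2, 7, 11, 15], 9))  # (0, 1)
```

Each element is visited once, so the running time is linear in the array length, at the cost of extra memory for the table.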

ChatGPT & New Bing

less than 1 minute read

Published:

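1. ChatGPT sign-up process

First, make sure you can reach the service through a proxy.
Then register a Google account.
You also need a foreign phone number for registration and for receiving the verification code.
Receiving the verification code once in Google Chrome is enough; after that you can log in repeatedly.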

2022

SimCLR

5 minute read

Published:

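Machine learning is usually divided into supervised learning, unsupervised learning, and reinforcement learning. Self-supervised learning is a form of unsupervised learning whose main goal is to learn a general-purpose feature representation that can be reused in downstream tasks. Its core idea is that the data supervises itself. Kaiming He's MoCo first set off a wave of discussion, and Yann LeCun also said at AAAI that self-supervised learning is where the field is heading.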

2015

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

2014

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

2013

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

2012

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.