Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Published:
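State: the status of the agent with respect to the environment.
State space: the set of all states.
Action: a move the agent can make in a given state.
Action space of a state: the set of available actions, which depends on the state.
State transition: when the agent takes an action, it may move from one state to another. This defines how the agent interacts with the environment. All transitions can be listed in a table, but a table can only represent the deterministic case.
State transition probability: describes state transitions with probabilities, which also covers the stochastic case.
Policy: usually denoted $\pi$, it tells the agent which action to take in each state. A policy may be stochastic and can be represented as a table.
Reward: a real number received after taking an action. A positive value can represent encouragement and a negative value punishment. The reward can be understood as a kind of human-machine interface.
Trajectory: a state-action-reward chain.
Return: the sum of all rewards collected along a trajectory. Comparing returns tells us which policy is better.
Discounted return: the return computed with a discount rate $\gamma$, which keeps the return from diverging and controls how short-sighted or far-sighted the agent is.
Episode: a trajectory with a finite number of steps that stops at a terminal state. Tasks of this kind are called episodic tasks. Some tasks have no terminal states, so the interaction between agent and environment continues forever. These are called continuing tasks.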
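The definitions above can be tied together with a small sketch. The three-state chain, its rewards, and every name below (`TRANSITIONS`, `REWARDS`, `POLICY`, `rollout`, `discounted_return`) are hypothetical and chosen only for illustration. The transition table is deterministic, the policy is a table from state to action, and the discounted return is computed as $G = r_0 + \gamma r_1 + \gamma^2 r_2 + \dots$

```python
# Hypothetical toy environment: s0 -> s1 -> s2, where s2 is terminal.
# Deterministic state transitions stored as a table: (state, action) -> next state.
TRANSITIONS = {
    ("s0", "right"): "s1",
    ("s0", "stay"): "s0",
    ("s1", "right"): "s2",
    ("s1", "stay"): "s1",
}

# Reward received for taking an action in a state (illustrative values).
REWARDS = {
    ("s0", "right"): 0.0,
    ("s0", "stay"): -1.0,
    ("s1", "right"): 1.0,
    ("s1", "stay"): -1.0,
}

# A deterministic policy as a table: state -> action.
POLICY = {"s0": "right", "s1": "right"}

TERMINAL_STATES = {"s2"}


def rollout(start, max_steps=10):
    """Follow POLICY from `start` and return the state-action-reward chain."""
    trajectory = []
    state = start
    for _ in range(max_steps):
        if state in TERMINAL_STATES:
            break  # the episode stops at a terminal state
        action = POLICY[state]
        reward = REWARDS[(state, action)]
        trajectory.append((state, action, reward))
        state = TRANSITIONS[(state, action)]
    return trajectory


def discounted_return(rewards, gamma):
    """Compute r_0 + gamma * r_1 + gamma^2 * r_2 + ... by working backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


episode = rollout("s0")
rewards = [r for _, _, r in episode]
print(episode)
print(discounted_return(rewards, gamma=0.9))
```

With a smaller `gamma` the later reward counts for less, which is the short-sighted versus far-sighted trade-off described above.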
Published:
graph TD
Toolformer_MetaAI_02/2023 --> LLaMA_MetaAI_02/2023 --> VisualChatGPT_Microsoft_08/03/2023 --> GigaGAN_Adobe_09/03/2023 --> Alpaca_Stanford_13/03/2023 --> GPT4_OpenAI_14/03/2023 --> PaLM_API_GoogleCloud_14/03/2023 --> Claude_Anthropic_14/03/2023 --> SeriesB_funding_350M_USD_Adept.ai_14/03/2023 --> Fifth_generation_text_to_image_model_Midjourney_15/03/2023 --> Copilot_Microsoft_16/03/2023
Published:
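1. ChatGPT has safety mechanisms.
2. ChatGPT can understand context and can remember roughly 8,000 words (GPT-4 now reaches 25,000 words).
3. ChatGPT can recognize its own limitations.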
Published:
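CLIP is a multimodal model that OpenAI released in January 2021, alongside another model, DALL-E. The two differ fundamentally: CLIP uses text as the supervision signal to train a transferable vision model, while DALL-E generates images from text.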
Published:
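For LeetCode problem 1, Two Sum, the reasoning goes as follows:
For an element nums[i], we need to know whether there is another element nums[j] whose value is target - nums[i]. A hash table that maps each element's value to its index lets us check quickly whether the array contains an element equal to target - nums[i].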
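The hash-table idea above can be sketched in a single pass. This is a minimal sketch, not a reference solution: the function name `two_sum` and the choice to return `None` when no pair exists are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def two_sum(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Return indices (j, i) with nums[j] + nums[i] == target, or None."""
    index_of: Dict[int, int] = {}  # value -> index of an element seen so far
    for i, value in enumerate(nums):
        complement = target - value
        # Look up the complement before storing the current value,
        # so an element is never paired with itself.
        if complement in index_of:
            return index_of[complement], i
        index_of[value] = i
    return None


print(two_sum([2, 7, 11, 15], 9))  # (0, 1)
```

Each lookup and insertion in the dictionary takes constant time on average, so the whole scan is linear in the length of `nums`.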
Published:
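First, make sure you can reach the service through a proxy.
Next, register a Google account.
You also need a foreign phone number to complete registration and receive the verification code.
You only need to receive the verification code once in Google Chrome. After that you can log in repeatedly!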
Published:
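Machine learning is usually divided into supervised learning, unsupervised learning, and reinforcement learning. Self-Supervised Learning is a form of unsupervised learning whose main goal is to learn a general-purpose feature representation that can be used for downstream tasks. It works by having the data supervise itself. Kaiming He's MoCo first set off a wave of discussion, and Yann LeCun also said at AAAI that Self-Supervised Learning is the direction the field is heading.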
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.