Thoughts on a Xilinx IP core
https://www.eefocus.com/walkie/blog/09-08/174703_f49d6.html
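A friend asked me about instantiating and using Xilinx's built-in IP cores, so I organized my notes and am posting them here.
The question was this: the floating-point divider IP lets you set its latency anywhere from 0 to 28, so does a latency of 28 give better timing, since it amounts to a 28-stage pipeline? The trade-off would be a larger area. That was the conclusion my friend and I first reached. But after we instantiated the floating-point divider IP, the results ISE reported did not match it. We later found that we had instantiated only the IP and forgotten to place registers before and after it, so the ISE results misled us. Others may have run into the same problem, so I am writing it up here in the hope that it is useful.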
Contents
1. Overall analysis
2. Divider IP on the 3E starter kit
3. Divider IP on the XUPV5-LX110T board
1. Analysis
Overall, the results agree with our earlier discussion; it was the first ISE run that was flawed.
―――――――――――――――――――――――
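Let us look at the 3E board first, i.e. section 2 below.
The result in 1) below is what we did first: a design that supposedly reaches 758 MHz. This design cannot actually run at that speed, and the 758 MHz figure is not a real clock period. The design has only one register stage, at the output, whereas the clock period must exceed the largest delay between two adjacent register stages. With only one stage there are no two register stages from which to compute the clock period, so the reported period is meaningless. Looking at the critical path after place and route, we find that the tool only timed the delay of that single register stage.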
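The result in 2) is the correct one. With a register stage added at the input, the design only reaches about 8 MHz. Only now do we see the true delay of the divider IP configured with a latency of one: roughly 125 ns (126.555 ns in the report). Structurally, the critical path now runs from the data input stage to the data output stage, i.e. the path between two register stages (FD), which is the delay of one divider.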
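Finally, in 3) the divider IP is configured with an internal latency of 28 clock cycles. Resource usage goes up (flip-flops in particular, and LUTs to a lesser extent), but the clock gets faster and reaches about 180 MHz.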
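To make the three 3E figures easier to compare, here is a minimal Python sketch that converts each reported minimum period to a maximum frequency and estimates how long one division takes in the pipelined case. The function and dictionary names are illustrative. The estimate assumes the latency-28 configuration takes exactly 28 clock cycles at the reported minimum period, and it ignores the extra input and output registers.

```python
def fmax_mhz(period_ns: float) -> float:
    """Return the maximum clock frequency in MHz for a minimum period in ns."""
    return 1000.0 / period_ns


# Minimum periods reported by synthesis for the three 3E cases in section 2.
PERIODS_NS = {
    "1) output register only": 1.319,
    "2) latency 1, input and output registers": 126.555,
    "3) latency 28, input and output registers": 5.568,
}

for case, period_ns in PERIODS_NS.items():
    print(f"{case}: {period_ns} ns -> {fmax_mhz(period_ns):.1f} MHz")

# Assumption: one division occupies 28 cycles at the minimum period of case 3).
pipelined_latency_ns = 28 * PERIODS_NS["3) latency 28, input and output registers"]
print(f"time for one result in case 3): about {pipelined_latency_ns:.1f} ns")
```

Under that assumption a single division in case 3) still takes on the order of 150 ns, comparable to the 126.555 ns of case 2). What the pipeline buys is a shorter clock period, and therefore higher throughput if the core accepts new operands every cycle, rather than a shorter individual division.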
―――――――――――――――――――――――
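Now let us look at the V5 board, i.e. section 3.
Cases A), B) and C) are essentially the same as on the 3E board, only with higher performance. For example, 2) on the 3E reaches only about 8 MHz, while B) on the V5 reaches about 18 MHz. This factor of roughly two is not necessarily exact, but it is clear that the same circuit runs faster on the V5 than on the 3E.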
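As a rough check on the "roughly two" figure, the following Python sketch divides the synthesis minimum periods of matching configurations on the two boards. The function name is illustrative, and the numbers are the synthesis estimates from sections 2 and 3, not post-place-and-route results.

```python
def speedup(period_3e_ns: float, period_v5_ns: float) -> float:
    """Return how many times shorter the V5 period is than the 3E period."""
    return period_3e_ns / period_v5_ns


# Synthesis minimum periods (ns) for matching configurations on each board.
latency1 = speedup(126.555, 55.397)   # case 2) on the 3E vs case B) on the V5
latency28 = speedup(5.568, 2.808)     # case 3) on the 3E vs case C) on the V5

print(f"latency 1:  {latency1:.2f}x")
print(f"latency 28: {latency28:.2f}x")
```

Both ratios come out close to two (about 2.3 and about 2.0), which fits the observation that the gain is real but not a fixed factor.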
2. 3E board
1) IP latency of one; no input register, registered output.
Timing:
Minimum period: 1.319ns (Maximum Frequency: 758.150MHz)
Resources:
Device utilization summary:
---------------------------
Selected Device : 3s500efg320-4
Number of Slices: 392 out of 4656 8%
Number of Slice Flip Flops: 33 out of 9312 0%
Number of 4 input LUTs: 744 out of 9312 7%
2) IP latency of one; registers on both input and output.
Timing:
Timing Summary:
---------------
Speed Grade: -4
Minimum period: 126.555ns (Maximum Frequency: 7.902MHz)
=========================================================================
Resources:
Device utilization summary:
---------------------------
Selected Device : 3s500efg320-4
Number of Slices: 426 out of 4656 9%
Number of Slice Flip Flops: 97 out of 9312 1%
Number of 4 input LUTs: 744 out of 9312 7%
Place and route fails with an error, because the critical path is far too long to meet the timing constraint (I set the period to 20 ns).
ERROR:Par:228 - At least one timing constraint is impossible to meet because component delays alone exceed the
constraint. A timing constraint summary below shows the failing constraints (preceded with an Asterisk (*)). Please
use the Timing Analyzer (GUI) or TRCE (command line) with the Mapped NCD and PCF files to identify which constraints
and paths are failing because of the component delays alone. If the failing path(s) is mapped to Xilinx components as
expected, consider relaxing the constraint. If it is not mapped to components as expected, re-evaluate your HDL and
how synthesis is optimizing the path. To allow the tools to bypass this error, set the environment variable
XIL_TIMING_ALLOW_IMPOSSIBLE to 1.
3) Latency of 28 clock cycles; registers on both input and output.
Timing:
Timing Summary:
---------------
Speed Grade: -4
Minimum period: 5.568ns (Maximum Frequency: 179.610MHz)
Resources:
Device utilization summary:
---------------------------
Selected Device : 3s500efg320-4
Number of Slices: 805 out of 4656 17%
Number of Slice Flip Flops: 1420 out of 9312 15%
Number of 4 input LUTs: 800 out of 9312 8%
Timing after place and route:
Timing constraint: TS_clk = PERIOD TIMEGRP "clk" 20 ns HIGH 50%;
31903 paths analyzed, 3966 endpoints analyzed, 0 failing endpoints
0 timing errors detected. (0 setup errors, 0 hold errors)
Minimum period is 6.739ns.
--------------------------------------------------------------------------------
Slack: 13.261ns (requirement - (data path - clock path skew + uncertainty))
Source: a_temp_0 (FF)
Destination: uut1/blk00000003/blk000000e3 (FF)
Requirement: 20.000ns
Data Path Delay: 6.717ns (Levels of Logic = 12)
Clock Path Skew: -0.022ns (0.116 - 0.138)
Source Clock: clk_BUFGP rising at 0.000ns
Destination Clock: clk_BUFGP rising at 20.000ns
Clock Uncertainty: 0.000ns
Maximum Data Path: a_temp_0 to uut1/blk00000003/blk000000e3
Delay type Delay(ns) Logical Resource(s)
---------------------------- -------------------
Tcko 0.587 a_temp_0
net (fanout=2) 2.786 a_temp<0>
Topcyf 1.162 uut1/blk00000003/blk00000c59
uut1/blk00000003/blk00000116
uut1/blk00000003/blk00000114
net (fanout=1) 0.000 uut1/blk00000003/sig00000203
Tbyp 0.118 uut1/blk00000003/blk00000112
uut1/blk00000003/blk00000110
net (fanout=1) 0.000 uut1/blk00000003/sig000001ff
Tbyp 0.118 uut1/blk00000003/blk0000010e
uut1/blk00000003/blk0000010c
net (fanout=1) 0.000 uut1/blk00000003/sig000001fb
Tbyp 0.118 uut1/blk00000003/blk0000010a
uut1/blk00000003/blk00000108
net (fanout=1) 0.000 uut1/blk00000003/sig000001f7
Tbyp 0.118 uut1/blk00000003/blk00000106
uut1/blk00000003/blk00000104
net (fanout=1) 0.000 uut1/blk00000003/sig000001f3
Tbyp 0.118 uut1/blk00000003/blk00000102
uut1/blk00000003/blk00000100
net (fanout=1) 0.000 uut1/blk00000003/sig000001ef
Tbyp 0.118 uut1/blk00000003/blk000000fe
uut1/blk00000003/blk000000fc
net (fanout=1) 0.000 uut1/blk00000003/sig000001eb
Tbyp 0.118 uut1/blk00000003/blk000000fa
uut1/blk00000003/blk000000f8
net (fanout=1) 0.000 uut1/blk00000003/sig000001e7
Tbyp 0.118 uut1/blk00000003/blk000000f6
uut1/blk00000003/blk000000f4
net (fanout=1) 0.000 uut1/blk00000003/sig000001e3
Tbyp 0.118 uut1/blk00000003/blk000000f2
uut1/blk00000003/blk000000f0
net (fanout=1) 0.000 uut1/blk00000003/sig000001df
Tbyp 0.118 uut1/blk00000003/blk000000ee
uut1/blk00000003/blk000000ec
net (fanout=1) 0.000 uut1/blk00000003/sig000001db
Tcinck 1.002 uut1/blk00000003/blk000000ea
uut1/blk00000003/blk000000e7
uut1/blk00000003/blk000000e3
---------------------------- ---------------------------
Total 6.717ns (3.931ns logic, 2.786ns route)
(58.5% logic, 41.5% route)
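The numbers in this report can be cross-checked by hand. The Python sketch below sums the delay entries of the path and applies the slack formula printed in the report. The variable and function names are illustrative, and the zero-delay carry-chain nets are left out because they add nothing to the total.

```python
# Delay entries copied from the maximum data path report above, in ns.
# Order: Tcko, Topcyf, ten Tbyp carry stages, Tcinck.
LOGIC_DELAYS_NS = [0.587, 1.162] + [0.118] * 10 + [1.002]
ROUTE_DELAYS_NS = [2.786]  # net a_temp<0>; the other nets are 0.000 ns


def data_path_delay(logic_ns, route_ns):
    """Total data path delay: all logic delays plus all routing delays."""
    return round(sum(logic_ns) + sum(route_ns), 3)


def slack(requirement_ns, data_path_ns, skew_ns, uncertainty_ns):
    """Slack = requirement - (data path - clock path skew + uncertainty)."""
    return round(requirement_ns - (data_path_ns - skew_ns + uncertainty_ns), 3)


path_ns = data_path_delay(LOGIC_DELAYS_NS, ROUTE_DELAYS_NS)
slack_ns = slack(20.0, path_ns, -0.022, 0.0)

print(f"data path delay: {path_ns} ns")
print(f"slack:           {slack_ns} ns")
print(f"minimum period:  {round(20.0 - slack_ns, 3)} ns")
```

The sum reproduces the 6.717 ns data path delay, the formula gives the 13.261 ns slack, and the requirement minus the slack gives the 6.739 ns minimum period stated at the top of the report.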
3. XUPV5-LX110T board
A) IP latency of one; no input register, registered output.
Timing:
Timing Summary:
---------------
Speed Grade: -1
Minimum period: 0.807ns (Maximum Frequency: 1239.157MHz)
Resources:
Device utilization summary:
---------------------------
Selected Device : 5vlx110tff1136-1
Slice Logic Utilization:
Number of Slice Registers: 33 out of 69120 0%
Number of Slice LUTs: 724 out of 69120 1%
Number used as Logic: 724 out of 69120 1%
Timing result after place and route:
Maximum Data Path: uut1/blk00000003/blk00000010 to result_out
Delay type Delay(ns) Logical Resource(s)
---------------------------- -------------------
Tcko 0.450 uut1/blk00000003/blk00000010
net (fanout=1) 1.506 result
Tdick 0.002 result_out
---------------------------- ---------------------------
Total 1.958ns (0.452ns logic, 1.506ns route)
(23.1% logic, 76.9% route)
B) IP latency of one; registers on both input and output; same code as above.
Timing Summary:
---------------
Minimum period: 55.397ns (Maximum Frequency: 18.052MHz)
Device utilization summary:
---------------------------
Selected Device : 5vlx110tff1136-1
Slice Logic Utilization:
Number of Slice Registers: 97 out of 69120 0%
Number of Slice LUTs: 724 out of 69120 1%
Number used as Logic: 724 out of 69120 1%
Place and route still fails:
ERROR:Pack:1653 - At least one timing constraint is impossible to meet because
component delays alone exceed the constraint. A timing constraint summary
below shows the failing constraints (preceded with an Asterisk (*)). Please
use the Timing Analyzer (GUI) or TRCE (command line) with the Mapped NCD and
PCF files to identify which constraints and paths are failing because of the
component delays alone. If the failing path(s) is mapped to Xilinx components
as expected, consider relaxing the constraint. If it is not mapped to
components as expected, re-evaluate your HDL and how synthesis is optimizing
the path. To allow the tools to bypass this error, set the environment
variable XIL_TIMING_ALLOW_IMPOSSIBLE to 1.
This is because we set the clock on the V5 to 100 MHz, whereas here the design only reaches about 18 MHz.
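The pass or fail outcome on both boards follows from comparing each minimum period with the period implied by the clock constraint. A minimal Python sketch, with illustrative function names and the synthesis estimates from this article as inputs (post-place-and-route numbers may differ):

```python
def required_period_ns(target_mhz: float) -> float:
    """Clock period in ns implied by a target frequency in MHz."""
    return 1000.0 / target_mhz


def meets_constraint(min_period_ns: float, target_mhz: float) -> bool:
    """True if the design's minimum period fits within the constrained period."""
    return min_period_ns <= required_period_ns(target_mhz)


# (label, synthesis minimum period in ns, constrained clock in MHz)
CASES = [
    ("3E 2) latency 1", 126.555, 50.0),    # 20 ns constraint on the 3E
    ("3E 3) latency 28", 5.568, 50.0),
    ("V5 B) latency 1", 55.397, 100.0),    # 100 MHz constraint on the V5
    ("V5 C) latency 28", 2.808, 100.0),
]

for label, period_ns, target_mhz in CASES:
    verdict = "meets" if meets_constraint(period_ns, target_mhz) else "fails"
    print(f"{label}: {period_ns} ns vs {required_period_ns(target_mhz):.1f} ns -> {verdict}")
```

Only the latency-28 configurations fit their constraints, which matches the two cases where place and route reports an impossible timing constraint.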
C) IP latency of 28; registers on both input and output.
Timing Summary:
---------------
Minimum period: 2.808ns (Maximum Frequency: 356.125MHz)
Minimum input arrival time before clock: 1.154ns
Device utilization summary:
--------------------------
Slice Logic Utilization:
Number of Slice Registers: 1417 out of 69120 2%
Number of Slice LUTs: 758 out of 69120 1%
Number used as Logic: 721 out of 69120 1%
Number used as Memory: 37 out of 17920 0%
Number used as SRL: 37
Timing after place and route:
Maximum Data Path: uut1/blk00000003/blk0000081e to uut1/blk00000003/blk00000097
Delay type Delay(ns) Logical Resource(s)
---------------------------- -------------------
Tcko 0.450 uut1/blk00000003/blk0000081e
net (fanout=1) 2.154 uut1/blk00000003/sig00000b4e
Tas 0.300 uut1/blk00000003/blk00000d47
uut1/blk00000003/blk00000099
uut1/blk00000003/blk00000097
---------------------------- ---------------------------
Total 2.904ns (0.750ns logic, 2.154ns route)
(25.8% logic, 74.2% route)
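The logic and route percentages in these path reports are simply each share of the total delay. The short Python sketch below recomputes them for the three paths shown in this article; the function name is illustrative.

```python
def split_percent(logic_ns: float, route_ns: float):
    """Return (logic %, route %) of the total path delay, rounded to 0.1."""
    total_ns = logic_ns + route_ns
    return round(100.0 * logic_ns / total_ns, 1), round(100.0 * route_ns / total_ns, 1)


# (label, logic delay in ns, route delay in ns) from the path reports.
PATHS = [
    ("3E 3) latency 28", 3.931, 2.786),
    ("V5 A) output register only", 0.452, 1.506),
    ("V5 C) latency 28", 0.750, 2.154),
]

for label, logic_ns, route_ns in PATHS:
    logic_pct, route_pct = split_percent(logic_ns, route_ns)
    print(f"{label}: {logic_pct}% logic, {route_pct}% route")
```

On the V5 the reported paths are dominated by routing delay, while the 3E path through the carry chain is mostly logic delay.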
Last updated: 2017-04-03 05:39:50