Thoughts on a Xilinx IP

 https://www.eefocus.com/walkie/blog/09-08/174703_f49d6.html

A friend asked me how to instantiate and use Xilinx's built-in IP cores, so I wrote up what we found and am posting it here.

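The question was about the floating-point divider IP, whose latency can be configured from 0 to 28 clock cycles. Would a latency of 28 give better timing, since it amounts to a 28-stage pipeline? The cost would be a larger area. That was the conclusion my friend and I first reached. But when we instantiated the floating-point divider IP, the results ISE reported did not match it. We later found the cause: we had instantiated only the IP and forgotten to place registers before and after it, so the numbers ISE produced misled us. Others may have run into the same problem, so I have written it up here. I hope it is useful.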

Contents
1. Overall analysis
2. Instantiating the divider IP on the Spartan-3E Starter Kit
3. Instantiating the divider IP on the XUPV5-LX110T board

1. Analysis
Overall, the results agree with our earlier discussion; it was the first ISE run that was flawed.
―――――――――――――――――――――――
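First, the 3E board (section 2 below).
Result 1) is what we ran first: a design that appears to reach 758 MHz. That design cannot actually run at this speed, and 758 MHz does not represent a real clock period. The design has only an output register stage, whereas the clock period must be at least the largest delay between any two adjacent register stages. With only one register stage around the divider, there is no pair of stages from which to compute the period, so the reported value is misleading. Looking at the critical path after place and route shows that the tool only measured the delay of a single register stage, not the divider logic.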
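The point about adjacent register stages can be illustrated with a toy model. This is only a sketch of the idea, not how ISE works internally: the `Path` class and `min_period_ns` function are hypothetical names, and the two delays are taken from the timing summaries of cases 1) and 2) in section 2, on the assumption that the 1.319 ns path runs from the IP's internal output register to the output buffer register.

```python
# Toy model: only paths that start AND end at a flip-flop (FF)
# constrain the clock period. Names and structure are illustrative.
from dataclasses import dataclass


@dataclass
class Path:
    start: str       # "FF" or "PORT"
    end: str         # "FF" or "PORT"
    delay_ns: float


def min_period_ns(paths):
    """Largest register-to-register delay, or None if there is no such path."""
    reg_to_reg = [p.delay_ns for p in paths if p.start == "FF" and p.end == "FF"]
    return max(reg_to_reg) if reg_to_reg else None


DIVIDER_NS = 126.555    # divider path, from the case 2) timing summary
OUT_STAGE_NS = 1.319    # assumed IP output FF -> output buffer FF, from case 1)

# Case 1): no input register, so the divider sits on a PORT -> FF path.
case1 = [Path("PORT", "FF", DIVIDER_NS), Path("FF", "FF", OUT_STAGE_NS)]
# Case 2): an input register turns the divider path into FF -> FF.
case2 = [Path("FF", "FF", DIVIDER_NS), Path("FF", "FF", OUT_STAGE_NS)]

for name, paths in (("case 1", case1), ("case 2", case2)):
    period = min_period_ns(paths)
    print(f"{name}: min period {period} ns, fmax {1000.0 / period:.3f} MHz")
```

In case 1 the divider delay is simply not part of any register-to-register path, so it cannot show up in the reported minimum period.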
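Result 2) is the correct one. With a register stage added at the input, the design can only run at about 8 MHz. Only now do we see the true delay of the divider IP configured with a latency of one: about 126.6 ns. Structurally, the critical path now runs from the data-input register stage to the data-output stage, that is, between two register stages (FD), which is the delay of one divider.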
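Finally, result 3) uses the divider IP configured with an internal latency of 28 clock cycles. Resource usage goes up (flip-flops especially, from 97 to 1420, while LUTs rise only from 744 to 800), but the clock gets faster, reaching about 180 MHz in the timing summary.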
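It helps to separate clock rate from the time one division takes. The sketch below uses the two minimum periods from the 3E timing summaries; it assumes the latency count is the number of clock cycles through the IP and ignores the extra wrapper registers, so the figures are rough.

```python
# Minimum periods (ns) and configured latencies (cycles) for the 3E runs
# with registered inputs and outputs. Wrapper registers are ignored.
PERIOD_NS = {"latency 1": 126.555, "latency 28": 5.568}
CYCLES = {"latency 1": 1, "latency 28": 28}


def fmax_mhz(period_ns):
    """Convert a minimum clock period in ns to a maximum frequency in MHz."""
    return 1000.0 / period_ns


def total_latency_ns(config):
    """Time from operands in to result out: cycles times clock period."""
    return CYCLES[config] * PERIOD_NS[config]


for config in PERIOD_NS:
    print(f"{config}: fmax {fmax_mhz(PERIOD_NS[config]):.1f} MHz, "
          f"result latency {total_latency_ns(config):.1f} ns")
```

Under these assumptions a single division takes slightly longer in the 28-stage version (28 cycles of 5.568 ns) than in the single-cycle version. What the pipeline buys is clock rate, and, if the core accepts a new operand pair every cycle (which the reports here do not confirm), throughput.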
―――――――――――――――――――――――
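Now the V5 board (section 3).
Cases A), B) and C) behave much like their 3E counterparts, only with higher performance. For example, case 2) on the 3E reaches only about 8 MHz, while case B) on the V5 reaches about 18 MHz. The roughly twofold ratio should not be taken as exact, but it is clear that the same circuit runs faster on the V5 than on the 3E.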
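The ratio can be computed directly from the minimum periods in the timing summaries of sections 2 and 3. The dictionary names below are only labels chosen for this sketch.

```python
# Minimum periods (ns) copied from the timing summaries in sections 2 and 3.
BOARD_3E = {"latency 1, registered I/O": 126.555,
            "latency 28, registered I/O": 5.568}
BOARD_V5 = {"latency 1, registered I/O": 55.397,
            "latency 28, registered I/O": 2.808}


def speedup(slow_period_ns, fast_period_ns):
    """How many times faster the second period is than the first."""
    return slow_period_ns / fast_period_ns


for case in BOARD_3E:
    ratio = speedup(BOARD_3E[case], BOARD_V5[case])
    print(f"{case}: V5 is {ratio:.2f}x faster than 3E")
```

Both configurations land near a factor of two, but not exactly on it, which matches the caution above about reading the ratio too literally.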

2. 3E board
1) IP latency of one, no input registers, registered output.
Timing:
   Minimum period: 1.319ns (Maximum Frequency: 758.150MHz)
Resources:
Device utilization summary:
---------------------------
Selected Device : 3s500efg320-4
 Number of Slices:                      392  out of   4656     8%
 Number of Slice Flip Flops:             33  out of   9312     0%
 Number of 4 input LUTs:                744  out of   9312     7%

2) IP latency of one, registers on both input and output
Timing:
Timing Summary:
---------------
Speed Grade: -4
   Minimum period: 126.555ns (Maximum Frequency: 7.902MHz)
=========================================================================
Resources:
Device utilization summary:
---------------------------
Selected Device : 3s500efg320-4
Number of Slices:                      426  out of   4656     9%
 Number of Slice Flip Flops:             97  out of   9312     1%
 Number of 4 input LUTs:                744  out of   9312     7%


Place and route fails with an error, because the critical path is far too long to meet the timing requirement (I had set the period to 20 ns).
ERROR:Par:228 - At least one timing constraint is impossible to meet because component delays alone exceed the
   constraint. A timing constraint summary below shows the failing constraints (preceded with an Asterisk (*)). Please
   use the Timing Analyzer (GUI) or TRCE (command line) with the Mapped NCD and PCF files to identify which constraints
   and paths are failing because of the component delays alone. If the failing path(s) is mapped to Xilinx components as
   expected, consider relaxing the constraint. If it is not mapped to components as expected, re-evaluate your HDL and
   how synthesis is optimizing the path. To allow the tools to bypass this error, set the environment variable
   XIL_TIMING_ALLOW_IMPOSSIBLE to 1.
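To get a feel for how far off the constraint is, one can estimate how many pipeline stages the path would need at minimum. This is a rough lower bound only: the hypothetical helper `ideal_stage_count` assumes the path could be cut into equal pieces and ignores register overhead, which real carry chains and routing rarely allow.

```python
import math


def ideal_stage_count(path_delay_ns, target_period_ns):
    """Lower bound on pipeline stages if the path could be split evenly."""
    if target_period_ns <= 0:
        raise ValueError("target period must be positive")
    return math.ceil(path_delay_ns / target_period_ns)


# 3E, latency-1 divider: 126.555 ns path against the 20 ns constraint.
print(ideal_stage_count(126.555, 20.0))
```

Even in this idealized view a single-stage divider cannot meet 20 ns, so relaxing the constraint or raising the IP latency are the realistic options.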

3) Latency of 28 clock cycles, registers on both input and output
Timing:
Timing Summary:
---------------
Speed Grade: -4
  Minimum period: 5.568ns (Maximum Frequency: 179.610MHz)
Resources:
Device utilization summary:
---------------------------

Selected Device : 3s500efg320-4
 Number of Slices:                      805  out of   4656    17%
 Number of Slice Flip Flops:           1420  out of   9312    15%
 Number of 4 input LUTs:                800  out of   9312     8%


Timing after place and route:

Timing constraint: TS_clk = PERIOD TIMEGRP "clk" 20 ns HIGH 50%;
 31903 paths analyzed, 3966 endpoints analyzed, 0 failing endpoints
 0 timing errors detected. (0 setup errors, 0 hold errors)
 Minimum period is   6.739ns.
--------------------------------------------------------------------------------
Slack:                  13.261ns (requirement - (data path - clock path skew + uncertainty))
  Source:               a_temp_0 (FF)
  Destination:          uut1/blk00000003/blk000000e3 (FF)
  Requirement:          20.000ns
  Data Path Delay:      6.717ns (Levels of Logic = 12)
  Clock Path Skew:      -0.022ns (0.116 - 0.138)
  Source Clock:         clk_BUFGP rising at 0.000ns
  Destination Clock:    clk_BUFGP rising at 20.000ns
  Clock Uncertainty:    0.000ns

  Maximum Data Path: a_temp_0 to uut1/blk00000003/blk000000e3
    Delay type         Delay(ns)  Logical Resource(s)
    ----------------------------  -------------------
    Tcko                  0.587   a_temp_0
    net (fanout=2)        2.786   a_temp<0>
    Topcyf                1.162   uut1/blk00000003/blk00000c59
                                  uut1/blk00000003/blk00000116
                                  uut1/blk00000003/blk00000114
    net (fanout=1)        0.000   uut1/blk00000003/sig00000203
    Tbyp                  0.118   uut1/blk00000003/blk00000112
                                  uut1/blk00000003/blk00000110
    net (fanout=1)        0.000   uut1/blk00000003/sig000001ff
    Tbyp                  0.118   uut1/blk00000003/blk0000010e
                                  uut1/blk00000003/blk0000010c
    net (fanout=1)        0.000   uut1/blk00000003/sig000001fb
    Tbyp                  0.118   uut1/blk00000003/blk0000010a
                                  uut1/blk00000003/blk00000108
    net (fanout=1)        0.000   uut1/blk00000003/sig000001f7
    Tbyp                  0.118   uut1/blk00000003/blk00000106
                                  uut1/blk00000003/blk00000104
    net (fanout=1)        0.000   uut1/blk00000003/sig000001f3
    Tbyp                  0.118   uut1/blk00000003/blk00000102
                                  uut1/blk00000003/blk00000100
    net (fanout=1)        0.000   uut1/blk00000003/sig000001ef
    Tbyp                  0.118   uut1/blk00000003/blk000000fe
                                  uut1/blk00000003/blk000000fc
    net (fanout=1)        0.000   uut1/blk00000003/sig000001eb
    Tbyp                  0.118   uut1/blk00000003/blk000000fa
                                  uut1/blk00000003/blk000000f8
    net (fanout=1)        0.000   uut1/blk00000003/sig000001e7
    Tbyp                  0.118   uut1/blk00000003/blk000000f6
                                  uut1/blk00000003/blk000000f4
    net (fanout=1)        0.000   uut1/blk00000003/sig000001e3
    Tbyp                  0.118   uut1/blk00000003/blk000000f2
                                  uut1/blk00000003/blk000000f0
    net (fanout=1)        0.000   uut1/blk00000003/sig000001df
    Tbyp                  0.118   uut1/blk00000003/blk000000ee
                                  uut1/blk00000003/blk000000ec
    net (fanout=1)        0.000   uut1/blk00000003/sig000001db
    Tcinck                1.002   uut1/blk00000003/blk000000ea
                                  uut1/blk00000003/blk000000e7
                                  uut1/blk00000003/blk000000e3
   ----------------------------  ---------------------------
    Total                 6.717ns (3.931ns logic, 2.786ns route)
                                 (58.5% logic, 41.5% route)
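The numbers in this report can be cross-checked by hand. The sketch below re-adds the delay components listed above and applies the slack formula printed in the report; the variable and function names are chosen only for this example.

```python
# Delay components copied from the path report above (ns).
LOGIC_NS = [0.587, 1.162] + [0.118] * 10 + [1.002]   # Tcko, Topcyf, 10x Tbyp, Tcinck
ROUTE_NS = [2.786]                                    # the only net with non-zero delay

REQUIREMENT_NS = 20.000
CLOCK_SKEW_NS = 0.116 - 0.138
UNCERTAINTY_NS = 0.000


def slack_ns(requirement, data_path, skew, uncertainty):
    """slack = requirement - (data path - clock path skew + uncertainty)"""
    return requirement - (data_path - skew + uncertainty)


data_path = sum(LOGIC_NS) + sum(ROUTE_NS)
slack = slack_ns(REQUIREMENT_NS, data_path, CLOCK_SKEW_NS, UNCERTAINTY_NS)

print(f"data path  : {data_path:.3f} ns")
print(f"logic share: {100 * sum(LOGIC_NS) / data_path:.1f} %")
print(f"slack      : {slack:.3f} ns")
print(f"min period : {REQUIREMENT_NS - slack:.3f} ns")
```

The minimum period is the requirement minus the slack, which is why the report's 6.739 ns is slightly larger than the 6.717 ns data path: the 0.022 ns of clock skew counts against the path.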

3. XUPV5-LX110T board
A) IP latency of one, no input registers, registered output.
Timing:
 Timing Summary:
---------------
Speed Grade: -1
   Minimum period: 0.807ns (Maximum Frequency: 1239.157MHz)

Resources:

Device utilization summary:
---------------------------
Selected Device : 5vlx110tff1136-1
Slice Logic Utilization:
 Number of Slice Registers:              33  out of  69120     0%
 Number of Slice LUTs:                  724  out of  69120     1%
    Number used as Logic:               724  out of  69120     1%

Timing after place and route:
  Maximum Data Path: uut1/blk00000003/blk00000010 to result_out
    Delay type         Delay(ns)  Logical Resource(s)
    ----------------------------  -------------------
    Tcko                  0.450   uut1/blk00000003/blk00000010
    net (fanout=1)        1.506   result
    Tdick                 0.002   result_out
   ----------------------------  ---------------------------
    Total                 1.958ns (0.452ns logic, 1.506ns route)
                                  (23.1% logic, 76.9% route)

B) IP latency of one, registers on both input and output; same code as above.
Timing Summary:
---------------
   Minimum period: 55.397ns (Maximum Frequency: 18.052MHz)
 Device utilization summary:
---------------------------

 Selected Device : 5vlx110tff1136-1
Slice Logic Utilization:
 Number of Slice Registers:              97  out of  69120     0%
 Number of Slice LUTs:                  724  out of  69120     1%
    Number used as Logic:               724  out of  69120     1%

Place and route still fails:
ERROR:Pack:1653 - At least one timing constraint is impossible to meet because
   component delays alone exceed the constraint. A timing constraint summary
   below shows the failing constraints (preceded with an Asterisk (*)). Please
   use the Timing Analyzer (GUI) or TRCE (command line) with the Mapped NCD and
   PCF files to identify which constraints and paths are failing because of the
   component delays alone. If the failing path(s) is mapped to Xilinx components
   as expected, consider relaxing the constraint. If it is not mapped to
   components as expected, re-evaluate your HDL and how synthesis is optimizing
   the path. To allow the tools to bypass this error, set the environment
   variable XIL_TIMING_ALLOW_IMPOSSIBLE to 1.
This is because we set the clock constraint on the V5 to 100 MHz, while the design can only reach 18 MHz.

C) IP latency of 28, registers on both input and output.
Timing Summary:
---------------
   Minimum period: 2.808ns (Maximum Frequency: 356.125MHz)
   Minimum input arrival time before clock: 1.154ns

Device utilization summary:
--------------------------
Slice Logic Utilization:
 Number of Slice Registers:            1417  out of  69120     2%
 Number of Slice LUTs:                  758  out of  69120     1%
    Number used as Logic:               721  out of  69120     1%
    Number used as Memory:               37  out of  17920     0%

     Number used as SRL:               37

Timing after place and route:
  Maximum Data Path: uut1/blk00000003/blk0000081e to uut1/blk00000003/blk00000097
    Delay type         Delay(ns)  Logical Resource(s)
    ----------------------------  -------------------
    Tcko                  0.450   uut1/blk00000003/blk0000081e
    net (fanout=1)        2.154   uut1/blk00000003/sig00000b4e
    Tas                   0.300   uut1/blk00000003/blk00000d47
                                  uut1/blk00000003/blk00000099
                                  uut1/blk00000003/blk00000097
   ----------------------------  ---------------------------
    Total                 2.904ns (0.750ns logic, 2.154ns route)
                                  (25.8% logic, 74.2% route)

Last updated: 2017-04-03 05:39:50
